This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH][RFC] Allow explicit shrinking of arena heaps using an environment variable


On Wed, Aug 01, 2012 at 06:08:56PM +0200, Florian Weimer wrote:
> On 08/01/2012 02:27 PM, Rich Felker wrote:
> 
> >>I find it surprising that PROT_NONE does not count against the
> >>commit limit (at least for initial allocations in 2.6.32-era
> >
> >Why? PROT_NONE is not special here. All that matters is that
> >PROT_WRITE is not included.
> 
> But you can turn PROT_NONE into PROT_WRITE using mprotect.  Now it
> happens that the accounting check is delayed until the mprotect
> call, but it doesn't have to be implemented this way.

This is why mprotect can fail with ENOMEM. The same is true for
private file-backed read-only mappings that are later changed to
read-write (e.g. .text segment of a shared library with textrels). If
it consumed commit charge as soon as it was created read-only, rather
than only after being changed to read-write, every application would
consume ridiculous amounts of memory (commit charge) for no reason.
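
A minimal sketch of the reserve-then-commit pattern described above,
assuming Linux with MAP_ANONYMOUS available. The helper names
reserve_region and commit_region are hypothetical, chosen only for
illustration. The point is that the mprotect call is where ENOMEM can
surface, so its return value has to be checked.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: reserve address space only. With PROT_NONE the
   mapping is not writable, so it should not count against commit charge
   on a kernel that accounts as described in this thread.
   Returns NULL on failure. */
void *reserve_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: make [p, p+len) readable and writable. This is
   the point where the accounting check happens, so it can fail with
   ENOMEM. Returns 0 on success, otherwise the errno value. */
int commit_region(void *p, size_t len)
{
    if (mprotect(p, len, PROT_READ | PROT_WRITE) != 0)
        return errno;
    return 0;
}
```

A caller would reserve a large range once and call commit_region on
page-aligned subranges as they are needed, treating a nonzero return
as an ordinary allocation failure.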

> > The same is true of read-only clean
> > anonymous maps (all zero) or read-only maps of files. The best example
> > is the program's .text/.rodata/etc. PT_LOAD segment that's read-only.
> > Except in the case of textrels (where it was temporarily made writable
> > and part or all of it was dirtied), this map does not contribute to
> > commit charge; if it did, the concept of shared program text would be
> > nearly meaningless.
> 
> It would still be an important performance optimization because you
> can share non-dirty pages between processes and use RAM more
> efficiently. You just lose the ability to conserve swap space.

And the ability to conserve commit charge (memory available to
applications for allocations). This is the most important.

> 
> >>kernels, I have not checked if applying it retroactively using
> >>mprotect, or on newer kernels).  As you explain, it is sound to do
> >>this, but the mmap(2) manual page suggests that MAP_NORESERVE
> >>has this effect as well, except that in reality, such a mapping does
> >>count against the limit.
> >
> >MAP_NORESERVE is a historical relic that violates the principle of
> >no-overcommit. It cannot be allowed to work, because it does not only
> >affect the calling process. If memory is overcommitted, any other
> >process could later fail when the kernel is unable to satisfy the
> >memory committed to that process; this would be a serious
> >vulnerability.
> 
> The same trick as with mprotect could be applied here, the
> accounting check could be deferred until an attempt is made to dirty
> the page.
> 
> It might be a challenge to write that SIGSEGV handler, but Hotspot
> is supposed to have one that attempts to recover from the
> out-of-memory situation.  Switching to PROT_NONE allocation with
> subsequent mprotect would be vastly preferable (because it improves
> behavior in mode 2), but it is difficult to convince anyone to rely
> on the PROT_NONE behavior.

That's silly. It's a fundamental part of how commit accounting works.
In the madvise+mprotect solution, it's the MADV_DONTNEED replacing the
pages with COW copies of their original backing (the zero page) that's
a controversial assumption to make. Assuming the pages are already
clean (non-dirty) COW copies of the zero page (or a file),
mprotect(PROT_NONE) removing them from the commit charge is 100%
reliable on any system that does proper commit charge accounting.
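
For concreteness, the madvise+mprotect sequence could look roughly like
the sketch below. The helper name decommit_region is hypothetical. It
relies on the Linux behavior that MADV_DONTNEED on a private anonymous
mapping discards the pages so that later accesses see zero-filled
pages, which is exactly the assumption noted above as the controversial
one; other systems may treat the advice differently.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for madvise and MADV_DONTNEED */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: give back a page-aligned range [p, p+len) of a
   private anonymous mapping while keeping the address space reserved.
   Step 1 drops the dirty pages (Linux-specific semantics assumed).
   Step 2 removes write permission so the range no longer needs to be
   counted against commit charge.
   Returns 0 on success, otherwise the errno value. */
int decommit_region(void *p, size_t len)
{
    if (madvise(p, len, MADV_DONTNEED) != 0)
        return errno;
    if (mprotect(p, len, PROT_NONE) != 0)
        return errno;
    return 0;
}
```

Making the range usable again is then a plain
mprotect(p, len, PROT_READ | PROT_WRITE), which may fail with ENOMEM
as discussed earlier.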

> >>Perhaps we should add a test case for the intended mprotect behavior?
> >
> >Just make a 2gb PROT_NONE map and fork a few thousand times... :-)
> 
> Right, I think this is actually testable without bringing down the box.

Indeed, this should not consume any physical memory (except process
overhead), and should only consume commit charge if the commit charge
accounting is buggy.
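
A rough sketch of such a test, not a proposed glibc test case. The
function name fork_with_reservation and the pipe used to keep all
children alive at the same time are my own choices for illustration.
It maps a PROT_NONE region, forks the requested number of children
that all inherit the mapping, and reports how many forks succeeded.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical test helper. Returns the number of children forked
   while the PROT_NONE mapping was in place, or -1 on setup failure. */
int fork_with_reservation(size_t len, int nchildren)
{
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int fds[2];
    if (pipe(fds) != 0) {
        munmap(p, len);
        return -1;
    }

    int forked = 0;
    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                      /* fork refused, e.g. ENOMEM or EAGAIN */
        if (pid == 0) {
            char c;
            close(fds[1]);              /* drop the inherited write end */
            if (read(fds[0], &c, 1) < 0) /* block until the parent closes its end */
                _exit(1);
            _exit(0);
        }
        forked++;
    }

    close(fds[1]);                      /* EOF releases every child */
    close(fds[0]);
    for (int i = 0; i < forked; i++)
        wait(NULL);
    munmap(p, len);
    return forked;
}
```

Calling it with a 2 GB length and a few thousand children would match
the suggestion quoted above. If every fork succeeds with strict
overcommit enabled, the mapping is presumably not being charged.
Process limits such as RLIMIT_NPROC can also make fork fail, so a
short count is not proof of an accounting bug by itself.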

Rich

