Improvements to fork handling

Ryan Johnson ryan.johnson@cs.utoronto.ca
Wed May 11 14:21:00 GMT 2011


On 11/05/2011 10:13 AM, Christopher Faylor wrote:
> On Wed, May 11, 2011 at 09:59:53AM +0200, Corinna Vinschen wrote:
>> On May 11 02:18, Ryan Johnson wrote:
>>> Please find attached five patches [...]
>> Oops, wrong mailing list...
>>
>> Btw., it would be nice if you could create patches with the diff -p flag
>> as well.  It's not exactly essential, but IMHO it's quite a help when
>> trying to review patches.
>>
>> Another problem is this:  While you provide separate patches, you don't
>> provide separate ChangeLogs.  That makes it kind of hard to apply them
>> separately.  Would you mind to create one ChangeLog per change?
> Ditto.  This really needs to be broken down into easier to review chunks.
All right. Let's try this again with the correct mailing list.

The patches have been generated with diff -p, and each includes 
appropriate changelog entries. Hopefully the changes are split up finely 
enough because I don't know a good way to break them down any further.

For posterity's sake I'm including the original message body below.

Ryan

Please find attached five patches which improve the behavior of forking 
when Windows isn't cooperating as well as we'd like. The results are not as 
good as I'd originally hoped for, in that it's still entirely possible 
(even common) for fork attempts to fail, but at least now they are clean 
failures. Most sources of access violations should be gone, address 
space clobbers lead to clean child exit, and retries are applied 
consistently. It will still be important both to rebase and to 
ASLR-enable dlls, however, because there are too many sources of address 
space clobbers which we really can't control.

One open issue remains: windows dlls, thread stacks, and heaps can and 
do end up at different locations in the child. This technically breaks 
fork semantics but I don't know whether we care.  Since we currently 
have no real way to track this or compensate for it in the absence of 
obvious address space clobbers, the question is probably moot in any case.

The first patch (fork-clean-exit) allows a child which failed due to 
address space clobbers to report cleanly back to the parent. As a 
result, DLL_LINK dlls which land in the wrong place, DLL_LOAD dlls whose 
space gets clobbered, and failures to replicate the cygheap all generate 
retries and dispense with the terminal spam. Handling of unexpected errors should not have 
changed. Further, the patch fixes several sources of access violations 
and crashes, including:
- accessing invalid state after failing to notice that a 
statically-linked dll loaded at the wrong location
- accessing invalid state while running dtors on a failed forkee. I 
follow cgf's approach of simply not running any dtors, based on the 
observation that dlls in the parent (gcc_s!) can store state about other 
dlls and crash trying to access that state in the child, even if they 
appeared to map properly in both processes.
- attempting to generate a stack trace when somebody in the call chain 
used alloca(). This one is only sidestepped here, because we eliminate 
the access violations and api_fatal calls which would have triggered the 
problematic stack traces. I have a separate patch which allows offending 
functions to disable stack traces, if folks are interested, but it was 
kind of noisy so I left it out for now (cygwin uses alloca pretty 
liberally!).

The second (fork-topsort) has the parent sort its dll list topologically 
by dependencies. Previously, attempts to load a DLL_LOAD dll risked 
pulling in dependencies automatically, and the latter would then not 
benefit from the code which "encourages" them to land in the right 
places. The dependency tracking is achieved using a simple class which 
lets us introspect a mapped dll image and pull out the dependencies it 
lists. The code currently rebuilds the dependency list at every fork 
rather than attempt to update it properly as modules are loaded and 
unloaded. Note that the topsort optimization affects only cygwin dlls, 
so any windows dlls which are pulled in dynamically (directly or 
indirectly) will still impose the usual risk of address space clobbers.
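
To make the ordering concrete, here is a minimal sketch of a dependency 
topsort. It is not the code from the patch: DllNode, visit and 
topsort_dlls are hypothetical names, and the dependency lists are given 
directly rather than read from a mapped image. Names that are not in the 
list (windows dlls, for instance) are simply skipped, and cycles are 
broken by ignoring the back edge.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry in the parent's dll list.
struct DllNode {
    std::string name;
    std::vector<std::string> deps;  // names the image lists as dependencies
};

// Depth-first post-order visit: a dll is emitted only after every tracked
// dependency it lists has been emitted.
static void visit(const std::string &name,
                  const std::map<std::string, const DllNode *> &by_name,
                  std::set<std::string> &done,
                  std::set<std::string> &in_progress,
                  std::vector<std::string> &order)
{
    if (done.count(name))
        return;
    auto it = by_name.find(name);
    if (it == by_name.end())
        return;  // not a tracked dll (e.g. a windows dll): skip it
    if (!in_progress.insert(name).second)
        return;  // already on the current path: break the cycle
    for (const auto &dep : it->second->deps)
        visit(dep, by_name, done, in_progress, order);
    in_progress.erase(name);
    done.insert(name);
    order.push_back(name);
}

// Returns the tracked dll names with dependencies ahead of their dependents.
std::vector<std::string> topsort_dlls(const std::vector<DllNode> &dlls)
{
    assert(!dlls.empty() || dlls.size() == 0);  // any list, including empty, is fine
    std::map<std::string, const DllNode *> by_name;
    for (const auto &d : dlls)
        by_name[d.name] = &d;
    std::set<std::string> done, in_progress;
    std::vector<std::string> order;
    for (const auto &d : dlls)
        visit(d.name, by_name, done, in_progress, order);
    return order;
}
```

Loading in the returned order means a dll's dependencies are already 
mapped by the time the dll itself is loaded, so the loader has nothing 
left to pull in automatically.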

The third (fork-reserve-at) fixes a bug in the reserve_at function which 
caused it to sometimes reserve space needed by the dll it was supposed 
to help land. This happens when the dll tries to land in a free region 
which overlaps the desired location. The new code exploits the image 
introspection to get the dll's image size and avoids the corner cases.
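
The overlap cases can be illustrated with plain address arithmetic. The 
sketch below is an assumption about one way to do the clipping, not an 
excerpt from the patch: Range and clip_reservation are hypothetical 
names, and the real function would go on to reserve the resulting range 
rather than just compute it.

```cpp
#include <cassert>
#include <cstdint>

// Half-open address range [begin, end).
struct Range {
    std::uintptr_t begin;
    std::uintptr_t end;
};

// Given the free region the dll wrongly landed in and the range where we
// want the dll to end up, return the part of the free region that can be
// reserved without touching the wanted range. The result may be empty.
Range clip_reservation(Range free_rgn, Range wanted)
{
    assert(free_rgn.begin <= free_rgn.end && wanted.begin <= wanted.end);
    Range r = free_rgn;
    if (wanted.begin < r.begin && wanted.end > r.begin)
        r.begin = wanted.end;   // wanted range straddles our left edge
    else if (wanted.begin >= r.begin && wanted.begin < r.end)
        r.end = wanted.begin;   // wanted range starts inside the region
    if (r.begin > r.end)
        r.begin = r.end;        // wanted range covers the whole region
    return r;
}
```

Note that when the wanted range sits in the middle of the free region, 
this simplification keeps only the part below it and leaves the part 
above unreserved.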

The fourth (fork-dll-load) rewrites dll_list::load_after_fork. 
The new version eliminates reserve_upto() and release_upto(), 
which were expensive (the process repeats for each dll) and buggy 
(release_upto could free allocations reserve_upto did not make). 
Instead, the effect of reserve_upto is achieved by recursively 
attempting to load each dll in its proper place and calling reserve_at 
before retrying; each reservation's location is kept on the stack 
throughout and release_at calls are made only when the recursion unwinds 
after all dlls have loaded. Further, the code (exploiting image 
introspection again) pre-reserves all space needed by each DLL_LOAD 
before starting the normal load process. This allows us to detect early 
whether Windows clobbered something from the start (allowing retry) and 
also ensures that the needed address space is not clobbered by later 
calls to reserve_at or by dlls allocating resources.
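
The control flow is easier to see in a toy model. The sketch below is 
not the patch: ToySpace, load_from and load_after_fork_toy are 
hypothetical names, the address space is a handful of integer slots, and 
the "loader" is assumed to be maximally uncooperative, always landing in 
the lowest free slot. It only shows the shape of the idea: pre-reserve 
every wanted slot, block each wrong landing spot and retry, and release 
the blocks only as the recursion unwinds.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy address space: each slot is either free or taken.
struct ToySpace {
    std::set<int> taken;
    int slots;
    explicit ToySpace(int n) : slots(n) {}
    // Take slot s; fails if s is out of range or already taken.
    bool reserve(int s) { return s >= 0 && s < slots && taken.insert(s).second; }
    void release(int s) { taken.erase(s); }
    // Uncooperative loader: lands in the lowest free slot, or -1 if full.
    int load() {
        for (int s = 0; s < slots; ++s)
            if (reserve(s))
                return s;
        return -1;
    }
};

// Load dlls i..n-1, each into its wanted slot. Blocking reservations live
// in the 'got' local of each stack frame and are released on unwind.
static void load_from(ToySpace &sp, const std::vector<int> &want, std::size_t i)
{
    if (i == want.size())
        return;
    sp.release(want[i]);            // drop the pre-reservation just before loading
    int got = sp.load();
    if (got == want[i]) {
        load_from(sp, want, i + 1); // landed correctly: move on to the next dll
        return;
    }
    // The wanted slot is free, so a wrong landing can only be lower.
    assert(got >= 0 && got < want[i]);
    sp.release(got);                // unload the misplaced dll
    sp.reserve(got);                // block the spot so the retry lands elsewhere
    load_from(sp, want, i);         // retry the same dll
    sp.release(got);                // released only after all dlls have loaded
}

// Returns false, leaving the space untouched, if any wanted slot is
// already occupied (an early-detected clobber that the caller can retry).
bool load_after_fork_toy(ToySpace &sp, const std::vector<int> &want)
{
    for (std::size_t i = 0; i < want.size(); ++i)
        if (!sp.reserve(want[i])) {
            for (std::size_t j = 0; j < i; ++j)
                sp.release(want[j]);
            return false;
        }
    load_from(sp, want, 0);
    return true;
}
```

Because every wanted slot is held from the start, a blocking reservation 
made for one dll can never cover the slot a later dll needs, and a 
clobbered slot is reported before any loading is attempted.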

The fifth and final patch (fork-bad-addr) adds a small optimization 
which reserves the lower 4MB of address space early in the process's 
lifetime (even if it's not a forkee). This was motivated by the 
observation that Windows tends to move things around a lot in that area, 
increasing the probability of future fork failures if the parent allows 
cygwin dlls to land there.  The patch does not fully address the 
problem, however, because ASLR can move things around even in higher 
addresses. This patch is optional: it should be harmless, but it may or 
may not improve fork success rates, since most fork failures for me 
involve DLL_LINK dlls which landed badly in the child.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fork-clean-exit.patch
URL: <http://cygwin.com/pipermail/cygwin-patches/attachments/20110511/1329a979/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fork-topsort.patch
URL: <http://cygwin.com/pipermail/cygwin-patches/attachments/20110511/1329a979/attachment-0001.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fork-reserve-at.patch
URL: <http://cygwin.com/pipermail/cygwin-patches/attachments/20110511/1329a979/attachment-0002.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fork-dll-load.patch
URL: <http://cygwin.com/pipermail/cygwin-patches/attachments/20110511/1329a979/attachment-0003.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fork-bad-addr.patch
URL: <http://cygwin.com/pipermail/cygwin-patches/attachments/20110511/1329a979/attachment-0004.ksh>

