Debugging help for fork failure: resource temporarily unavailable

Ryan Johnson ryanjohn@ece.cmu.edu
Tue Mar 15 15:04:00 GMT 2011


On 2:59 PM, Jon TURNEY wrote:
> On 09/03/2011 17:04, Ryan Johnson wrote:
>> BTW, while looking at the code I noticed a potential source of remap problems:
>> if B depends on A, and we remap A first, then only A's location will be
>> checked carefully; B will be pulled in wherever it happens to end up when we
>> do the full load of A. The code seems to assume that every DLL we try to remap
>> is currently not loaded.
>>
>> I'm actually not sure what would happen when time came to remap B, because
>> loading it would just return the handle we didn't know we had, and closing
>> that handle wouldn't take its reference count to zero.
> I too have idly mused that there might be an issue with dependent DLLs here.
>
> But, since dll_list::load_after_fork() walks the dll list in the same order as
> the dlopen() calls occur, I've never been able to convince myself there is a
> real problem, barring esoteric scenarios like: B depends on A, C depends on A,
> load B, load C (C collides with A so loads at non-preferred address), unload
> B, fork
Oh, I see what you mean... in theory, asking Windows to load the same 
dlls in the same order should put them at the same addresses.
> That doesn't match what really happens though: where problems are seen it's
> often with python or perl, which dynamically load libraries when modules are
> imported, but won't unload them in normal use.
All of this assumes Windows is consistent in choosing locations when 
conflicts are involved. IOW, consider the case that B depends on A, with 
A and B both conflicting with a later-loaded C. The first time A and C 
load, Windows will choose alternate locations for them, and if that order 
changes in the child, it's totally possible that A ends up in the child 
where C was in the parent.

>> Incidentally, this
>> same problem would arise if a BLODA injected a DLL into the process -- that
>> DLL would be on the todo list for fork() to process (because it was also
>> injected into the parent process), but would already be loaded by the time we
>> try to remap it. Also, if we do want to force Windows not to put a dll in a
>> certain address, wouldn't it make more sense to reserve the (wrong) space it
>> went into on the first try? Right now if the offending location is higher than
>> the one we want, nothing stops Windows from just putting it right back in its
>> old spot because the code only reserves locations lower than the desired one.
>>
>> Is this accurate or am I missing something here?
> I'm not sure that particular scenario with injected DLLs is possible, as the
> list traversed in dll_list::load_after_fork() is only of dynamically loaded
> cygwin-based DLLs?
Oh, so injected dlls, though not statically linked in, still wouldn't be 
on this list?

BTW, I found a good way to identify, if not fix, BLODA: given an app 
which loads no libraries at runtime -- such as 'ls' -- any dlls 
mentioned in /proc/$$/maps which cygcheck does not mention are probably 
dodgy. In my case, Windows Live (which I didn't think was even installed 
on my machine) has injected a WLIDNSP.DLL ("Microsoft Windows Live ID 
Namespace Provider") in all my processes.
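
A minimal sketch of that comparison, assuming the maps text has the usual 
six-column layout with the path last and that cygcheck prints one indented 
Windows path per line. The function names and the sample data (including 
the WLIDNSP.DLL path) are made up for illustration:

```python
def dll_names_from_maps(maps_text):
    """Collect lower-cased DLL basenames from /proc/<pid>/maps-style text."""
    names = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)  # the path, if present, is the sixth field
        if len(fields) == 6 and fields[5].strip().lower().endswith(".dll"):
            path = fields[5].strip().replace("\\", "/")
            names.add(path.rsplit("/", 1)[-1].lower())
    return names


def dll_names_from_cygcheck(cygcheck_text):
    """Collect lower-cased DLL basenames from cygcheck-style output."""
    names = set()
    for line in cygcheck_text.splitlines():
        path = line.strip().replace("\\", "/")
        if path.lower().endswith(".dll"):
            names.add(path.rsplit("/", 1)[-1].lower())
    return names


def suspicious_dlls(maps_text, cygcheck_text):
    """DLLs that are mapped into the process but that cygcheck did not list."""
    return sorted(dll_names_from_maps(maps_text)
                  - dll_names_from_cygcheck(cygcheck_text))


# Made-up sample data, for illustration only.
SAMPLE_MAPS = """
00400000-00401000 r--p 00000000 0000:0000 0 /usr/bin/ls.exe
61000000-61001000 r--p 00000000 0000:0000 0 /usr/bin/cygwin1.dll
7C800000-7C801000 r--p 00000000 0000:0000 0 /cygdrive/c/WINDOWS/system32/kernel32.dll
10000000-10001000 r--p 00000000 0000:0000 0 /cygdrive/c/some dir/WLIDNSP.DLL
"""

SAMPLE_CYGCHECK = r"""
C:\cygwin\bin\ls.exe
  C:\cygwin\bin\cygwin1.dll
    C:\WINDOWS\system32\KERNEL32.dll
"""

print(suspicious_dlls(SAMPLE_MAPS, SAMPLE_CYGCHECK))
```

Names are compared case-insensitively by basename only, so a hit is a 
hint to look closer rather than proof of injection.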

> $ objdump -p /usr/bin/cygpyglib-2.0-python2.6-0.dll | grep ^ImageBase
> ImageBase               6aa40000
>
> $ objdump -p /usr/bin/cygglib-2.0-0.dll | grep ^ImageBase
> ImageBase               6aa40000
>
> C:\cygwin\bin\cygglib-2.0-0.dll @ 0x6AA40000 using DONT_RESOLVE_DLL_REFERENCES
>     1263 [main] python 3008 dll_list::load_after_fork: reserve_upto 0x18C40000
> to try to force it to load there
>     1473 [main] python 3008 dll_list::load_after_fork: LoadLibrary
> C:\cygwin\bin\cygglib-2.0-0.dll @ 0x6AA40000 using DONT_RESOLVE_DLL_REFERENCES
>     1620 [main] python 3008 C:\cygwin\bin\python.exe: *** fatal error - unable
> to remap C:\cygwin\bin\cygglib-2.0-0.dll to same address as parent: 0x18C40000
> != 0x6AA40000
>
> and I've confirmed that in the parent, cygpyglib-2.0-python2.6-0.dll loads at
> 0x6AA40000 and cygglib-2.0-0.dll loads at 0x18C40000.
>
> At a wild guess, it looks like LoadLibraryEx() maps DLLs into memory starting
> from the top of the dependency chain, but then calls the DLL's entry point
> starting from the bottom of the dependency chain (which makes all kinds of
> sense, but leads to this inversion of the load order in the child)
>
So the problem basically arises because dlls in the child are not 
actually loaded in the same order as in the parent?  In this case I 
assume that cygpyglib depends on cygglib, which suggests that we could 
avoid a lot of trouble by handling dependent children first.

Also, it looks like the above is exactly the case I suspected -- the 
offending dll attempts to load *higher* than where we want it, so 
reserving space below does nothing for us.
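
To illustrate, here is a toy model of the placement, not of the real 
Windows loader. It assumes a DLL lands at its preferred base when that 
range is free and otherwise at the lowest free 64K-aligned address. The 
place() helper and the 1MB image size are hypothetical; the two addresses 
are the ones from the log above:

```python
GRANULARITY = 0x10000  # Windows allocation granularity (64K)


def place(preferred, size, blocked):
    """Toy loader: use the preferred base if free, else the lowest free slot.

    blocked is a list of (start, end) half-open ranges that are reserved.
    """
    def is_free(base):
        return all(base + size <= lo or base >= hi for lo, hi in blocked)

    if is_free(preferred):
        return preferred
    base = 0
    while not is_free(base):
        base += GRANULARITY
    return base


WANTED = 0x18C40000     # where cygglib sat in the parent
PREFERRED = 0x6AA40000  # cygglib's ImageBase
SIZE = 0x100000         # hypothetical image size

# Reserving only the space below the wanted address: the preferred base is
# still free, so the DLL goes straight back there.
below_only = place(PREFERRED, SIZE, [(0, WANTED)])

# Also blocking the preferred range forces the fallback search, which in
# this model ends at the wanted address.
both = place(PREFERRED, SIZE, [(0, WANTED), (PREFERRED, PREFERRED + SIZE)])

print(hex(below_only), hex(both))
```

Whether the real loader falls back to the lowest free address is exactly 
the assumption the reserve-below trick relies on, so treat the second 
result as illustrative only.
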
>> I assume there's a way to enumerate the dlls loaded in a given process; would
>> it make sense to use a three-step algorithm?
>> 1. Unload all currently-loaded dlls, complaining loudly to stderr or a log
>> file (these are due to BLODA and deserve to be called out)
>> 2. Load without deps every DLL and make sure it lands at the right address
>> (using memory reservation tricks if needed)
>> 3. Reload with deps every DLL. Presumably once it has landed correctly once it
>> will do so thereafter (the current code assumes this, at least)
> Doing 2 & 3 is an interesting idea, the first call to let you pin it at a
> particular address and the second to make it executable.
>
> I've no idea what happens, but unfortunately, the comments in
> dll_list::load_after_fork() seem to suggest this doesn't work, as the DLL's
> entry point doesn't get called the second time it's loaded.
The code currently unloads the library completely and then reloads it 
normally, which I assumed was to ensure entry points get called.

>> In theory, the first step might allow cygwin to resist dll injection (maybe on
>> an opt-out basis?), though I don't know what the consequences of that choice
>> would be.
>>
>> The third step would be significantly easier if we had a dependency graph so
>> that we could ensure dependencies always get processed before they're needed,
>> but I don't know if that's feasible. How expensive/embeddable is cygcheck?
> Another idea (assuming my guess about LoadLibrary() behaviour above is
> correct) would be to have dlopen() rather than simply call LoadLibrary() on a
> DLL, construct the dependency tree of the DLL it's been asked to open and load
> the DLLs starting from the bottom, so that the order of loading into memory
> matches the order which entry points are called (and hence the order in
> dll_list)? (This would have the advantage of not making fork() even more
> heavyweight)
Some variant of objdump -p $THE_DLL  | grep 'DLL Name' ?
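
As a rough sketch, the direct imports could be pulled out of that output 
like this. The helper names are hypothetical, and the sample text is made 
up (it assumes cygpyglib imports cygglib, which I have not checked):

```python
import re
import subprocess


def imported_dlls(objdump_output):
    """Return the names on the 'DLL Name:' lines of `objdump -p` output."""
    return re.findall(r"^\s*DLL Name:\s*(\S+)", objdump_output, re.MULTILINE)


def direct_dependencies(dll_path):
    """Run `objdump -p` on one DLL and list its direct imports (untested)."""
    result = subprocess.run(["objdump", "-p", dll_path],
                            capture_output=True, text=True, check=True)
    return imported_dlls(result.stdout)


# Made-up excerpt of objdump output, for illustration only.
SAMPLE_OUTPUT = """
ImageBase               6aa40000
        DLL Name: cygglib-2.0-0.dll
        DLL Name: cygwin1.dll
        DLL Name: KERNEL32.dll
"""

print(imported_dlls(SAMPLE_OUTPUT))
```

This only gives one level of imports; building the whole tree would mean 
repeating it for each name after resolving it to a path.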

It might also make sense for the parent process to record some ordering 
information at dlopen time in case it forks later. Given that the dlls 
are being opened anyway it would probably be cheap to do it then. Just build 
a tree of all dlls which the current dlopen() triggers dlopen() calls 
for. Alternatively (simpler?) just make dlopen() add dlls to its list 
just before it returns. That way, any recursive calls will add the 
dependencies to the list first. No special data structures needed. Only 
problem is, I can't see where in the source this magical list is 
generated in the first place :(
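
The "add to the list just before returning" idea amounts to a post-order 
walk of the dependency graph. A minimal sketch, with a plain dict standing 
in for whatever the real loader knows about dependencies (the function 
name and the dependency data are hypothetical):

```python
def record_load_order(name, deps, order=None, seen=None):
    """Append name to order only after all of its dependencies.

    deps maps a DLL name to the names it imports; seen guards against
    revisiting a DLL (and against circular imports).
    """
    order = [] if order is None else order
    seen = set() if seen is None else seen
    if name in seen:
        return order
    seen.add(name)
    for dep in deps.get(name, ()):
        record_load_order(dep, deps, order, seen)
    order.append(name)  # recorded "just before returning"
    return order


# Hypothetical dependency data: B and C both depend on A.
DEPS = {"B": ["A"], "C": ["A"]}

order, seen = [], set()
record_load_order("B", DEPS, order, seen)
record_load_order("C", DEPS, order, seen)
print(order)
```

Passing the same order and seen into each call plays the role of the 
single list that survives across dlopen() calls.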

> Alternatively, maybe all that is needed is a slightly more complex approach to
> forcing the DLL to load at a particular address?  If reserve_upto() has been
> called, but it loads higher than that, can we assume load order inversion has
> occurred, and try to block it from loading at its preferred address by
> VirtualAlloc()-ing there as well? I think I might even try to write a patch to
> do that...
The second approach might be easier to hack together quickly, but the 
first would actually make fork() more efficient and eliminate a lot of 
code: it's likely that all the rebasing/remapping fallbacks could 
disappear.

A third alternative would be to traverse the remaining list of dlls and 
find the one that we should have loaded first. This would have to be 
recursive to handle the case where several dlls map to the same base, 
but might otherwise be workable.

Ryan


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


