semi-solved: fork-related access violations on win7-x64
Ryan Johnson
ryan.johnson@cs.utoronto.ca
Sat Apr 16 06:03:00 GMT 2011
Hi all,
I've isolated one source of access violations on my win7-x64 machine,
and it's nasty.
The offending series of events is:
1. Two linked-in dlls share the same base address
2. The process forks
3. Windows assigns the child's dll a different base address than it
chose for the parent
4. This code from dll::init () (dll_init.cc) runs in the child, with p
still holding addresses from the parent:
> 1.75 (07-May-10): /* This should be a no-op. Why didn't we
> just import this variable? */
> 1.78 (27-Mar-11): if (!p.envptr)
> 1.78 (27-Mar-11): p.envptr = &__cygwin_environ;
> 1.79 (06-Apr-11): else if (*(p.envptr) != __cygwin_environ)
> 1.78 (27-Mar-11): *(p.envptr) = __cygwin_environ;
It was only recently that "somebody" noticed that the envptr could be
wrong and added code to "fix" it, but that leaves all the other members
of p just as wrong as before. If we're lucky, p points to unmapped
memory, causing one access violation; otherwise, we jump off into la-la
land and do who-knows-what with bad addresses.
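
To make the "all the other members" point concrete, here is a minimal sketch. It assumes a made-up record type (fake_per_process, which is NOT Cygwin's real per_process) and an assumed image size; it only counts how many recorded addresses fall outside the image at the child's base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the per-dll record; the member names are
// illustrative only and do not mirror Cygwin's real per_process layout.
struct fake_per_process {
    std::uintptr_t envptr;
    std::uintptr_t ctors;
    std::uintptr_t dtors;
    std::uintptr_t data_start;
};

// True if addr lies inside the image range [base, base + size).
inline bool in_image(std::uintptr_t addr, std::uintptr_t base, std::size_t size) {
    return addr >= base && addr - base < size;
}

// Count the members that do not point into the image loaded at child_base.
inline int count_stale(const fake_per_process& p,
                       std::uintptr_t child_base, std::size_t size) {
    assert(size > 0);
    const std::uintptr_t members[] = {p.envptr, p.ctors, p.dtors, p.data_start};
    int stale = 0;
    for (std::uintptr_t m : members)
        if (!in_image(m, child_base, size))
            ++stale;
    return stale;
}
```

With a record filled in at the parent's base, patching envptr alone lowers the stale count by one and leaves every other member pointing at the old location.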
It was trivial to make dll_list::alloc() call api_fatal() when it
detects a parent/child handle mismatch; whatever spawns the child
process is apparently willing to try as many as six times before giving
up. Six retries give 8/60, around an 85% success rate, for my toy benchmark,
suggesting that Windows 7 has a ~25% probability of resolving a
conflicting dll base address the same way twice in a row. This varies
all over the map, though: sometimes fork() succeeds in one try quite a few
times in a row; or it may fail completely just as many times in a row, with
2-3 failures being the most common.
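
The arithmetic behind the ~25% estimate can be sketched as follows, assuming each attempt is independent with the same success probability. Given the streaks described above, that is only a rough model. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Probability that at least one of `tries` independent attempts succeeds,
// given a per-attempt success probability p.
inline double success_within(double p, int tries) {
    assert(p >= 0.0 && p <= 1.0 && tries >= 0);
    return 1.0 - std::pow(1.0 - p, tries);
}

// Per-attempt success probability implied by an observed overall success
// rate after `tries` attempts (the inverse of success_within).
inline double per_try_from_overall(double overall, int tries) {
    assert(overall >= 0.0 && overall < 1.0 && tries > 0);
    return 1.0 - std::pow(1.0 - overall, 1.0 / tries);
}
```

Under that assumption, six tries at 25% each succeed about 82% of the time, and an observed 85% over six tries implies roughly 27% per try, so the two figures are in the same ballpark.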
Unfortunately, the failed forks don't quite go away cleanly, since a
static destructor from one of my two conflicting dlls tries to run (and
fails), as does some cygwin-related finalization:
> 14428 [main] fork 6148
> C:\cygwin\home\Ryan\experiments\fork-tests\fork.exe: *** fatal error -
> Location of C:\cygwin\home\Ryan\experiments\fork-tests\cygfoo.dll
> changed from 0x3A0000 (parent) to 0x320000 (child)
> Stack trace:
> Frame Function Args
> 0027B45C 610294DB (0027B45C, 00000000, 00000000, 00000000)
> 0027B74C 610294DB (00000001, 00008000, 00000000, 61184ADA)
> 0027C77C 61005E37 (611AC5E8, 0027C7A4, 003A0000, 00320000)
> 0028C7AC 61022626 (611E2440, 00320000, 00324078, 00000002)
> 0028C7EC 61022814 (0028F9F0, 0028C828, 6102271D, 00000000)
> End of stack trace
> * * * (null) fini
> CloseHandle(win32_obj_id<0x104>) failed virtual
> pthread_mutex::~pthread_mutex(): 1585, Win32 error 6
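
For reference, the kind of check that produces the fatal error above can be sketched like this. It is a standalone illustration, not the actual change to dll_list::alloc(): base_mismatch is a hypothetical helper that returns the message text instead of calling api_fatal().

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: compares the base address recorded by the parent
// with the one the child actually got. Returns an empty string when they
// match, otherwise a message in the style of the log above.
inline std::string base_mismatch(const std::string& name,
                                 std::uintptr_t parent_base,
                                 std::uintptr_t child_base) {
    if (parent_base == child_base)
        return std::string();
    char buf[80];
    std::snprintf(buf, sizeof buf, "0x%llX (parent) to 0x%llX (child)",
                  static_cast<unsigned long long>(parent_base),
                  static_cast<unsigned long long>(child_base));
    return "Location of " + name + " changed from " + buf;
}
```

A caller would treat a non-empty result as fatal for the child and leave the retry decision to whatever spawned it.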
So... is there any way to unload a DLL_LINK and "encourage" it to go to
the right place? Alternatively, is there a quieter way to kill off
failed child processes? I would imagine that only a rebase can make a
DLL_LINK always go where it belongs on the first try, much as I despise
the temporary band-aid that is rebasing.
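
Whether a rebase is needed at all comes down to whether any two dlls claim overlapping address ranges. A minimal sketch of that test, assuming the preferred bases and image sizes have already been collected by some other means (image_range and find_overlaps are made-up names, not part of any rebase tool):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one dll's preferred address range.
struct image_range {
    std::string name;
    std::uintptr_t base;
    std::uintptr_t size;
};

// Return every pair of images whose ranges [base, base + size) overlap.
inline std::vector<std::pair<std::string, std::string>>
find_overlaps(std::vector<image_range> imgs) {
    // Sort by base so that any image overlapping imgs[i] follows it directly.
    std::sort(imgs.begin(), imgs.end(),
              [](const image_range& a, const image_range& b) {
                  return a.base < b.base;
              });
    std::vector<std::pair<std::string, std::string>> out;
    for (std::size_t i = 0; i < imgs.size(); ++i)
        for (std::size_t j = i + 1;
             j < imgs.size() && imgs[j].base < imgs[i].base + imgs[i].size;
             ++j)
            out.emplace_back(imgs[i].name, imgs[j].name);
    return out;
}
```

Two dlls sharing a base address, as in step 1 above, show up as one overlapping pair.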
Regards,
Ryan