This is the mail archive of the cygwin-developers mailing list for the Cygwin project.
semi-solved: fork-related access violations on win7-x64
- From: Ryan Johnson <ryan dot johnson at cs dot utoronto dot ca>
- To: cygwin-developers at cygwin dot com
- Date: Sat, 16 Apr 2011 02:02:54 -0400
- Subject: semi-solved: fork-related access violations on win7-x64
Hi all,
I've isolated one source of access violations on my win7-x64 machine,
and it's nasty.
The offending series of events is:
1. Two linked-in dlls share the same base address
2. The process forks
3. Windows assigns the child's dll a different base address than it
chose for the parent
4. This code from dll::init () (dll_init.cc) runs in the child, with p
still holding addresses from the parent:
    1.75 (07-May-10):   /* This should be a no-op. Why didn't we just import this variable? */
    1.78 (27-Mar-11):   if (!p.envptr)
    1.78 (27-Mar-11):     p.envptr = &__cygwin_environ;
    1.79 (06-Apr-11):   else if (*(p.envptr) != __cygwin_environ)
    1.78 (27-Mar-11):     *(p.envptr) = __cygwin_environ;
It was only recently that "somebody" noticed that the envptr could be
wrong and added code to "fix" it, but that leaves all the other members
of p just as wrong as before. If we're lucky, p points to unmapped
memory, causing one access violation; otherwise, we jump off into la-la
land and do who-knows-what with bad addresses.
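
To illustrate why every address recorded in the parent goes stale when the dll moves, here is a minimal sketch. It is not Cygwin code: `Image`, `inside`, and `translate` are hypothetical names, and the image size used in the example values is an assumption. It only shows that an absolute address taken inside the parent's copy of the dll no longer lands inside the child's copy once the base differs.

```cpp
#include <cstdint>

// Hypothetical stand-in for a loaded dll image; not a Cygwin type.
struct Image {
    std::uintptr_t base;  // load (base) address of the image
    std::uintptr_t size;  // image size in bytes (assumed for illustration)
};

// True if addr lies inside the given image.
bool inside(const Image& img, std::uintptr_t addr) {
    return addr >= img.base && addr - img.base < img.size;
}

// Where an address recorded in the parent would live in the child,
// assuming the same offset from the image base in both processes.
std::uintptr_t translate(const Image& parent, const Image& child,
                         std::uintptr_t addr) {
    return addr - parent.base + child.base;
}
```

For example, with a parent base of 0x3A0000, a child base of 0x320000, and an assumed image size of 0x10000, the parent address 0x3A4078 is inside the parent's image but outside the child's; the same offset in the child would be 0x324078. This is an illustration of the failure mode only, not a proposal to patch up the members of p one by one.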
It was trivial to make dll_list::alloc() call api_fatal() when it
detects a parent/child handle mismatch; whatever spawns the child
process is apparently willing to try as many as six times before giving
up. Six retries gives around 85% success rate (8/60 failures) for my
toy benchmark,
suggesting that Windows 7 has ~25% probability of resolving a
conflicting dll base address the same way twice in a row. This varies
all over the map, tho: sometimes fork() succeeds in one try quite a few
times in a row; or it may fail completely as many times in a row, with
2-3 failures being the most common.
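
The arithmetic behind that estimate can be sketched as follows, under the loud assumption that each attempt succeeds independently with the same probability (which the streaky behaviour described above suggests is only a rough model). The function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Probability that at least one of `tries` independent attempts succeeds,
// given a per-attempt success probability p.
double success_within(double p, int tries) {
    assert(p >= 0.0 && p <= 1.0 && tries >= 0);
    return 1.0 - std::pow(1.0 - p, tries);
}

// Per-attempt success probability implied by an observed overall failure
// rate after `tries` independent attempts.
double per_attempt_from_failure_rate(double failure_rate, int tries) {
    assert(failure_rate >= 0.0 && failure_rate <= 1.0 && tries > 0);
    return 1.0 - std::pow(failure_rate, 1.0 / tries);
}
```

Under this model, success_within(0.25, 6) comes out at roughly 0.82, and per_attempt_from_failure_rate(0.15, 6) at roughly 0.27, so a per-attempt figure in the neighbourhood of 25% is consistent with an overall success rate of around 85%.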
Unfortunately, the failed forks don't quite go away cleanly, since a
static destructor from one of my two conflicting dlls tries to run (and
fails), as does some cygwin-related finalization:
14428 [main] fork 6148
C:\cygwin\home\Ryan\experiments\fork-tests\fork.exe: *** fatal error -
Location of C:\cygwin\home\Ryan\experiments\fork-tests\cygfoo.dll
changed from 0x3A0000 (parent) to 0x320000 (child)
Stack trace:
Frame     Function  Args
0027B45C  610294DB  (0027B45C, 00000000, 00000000, 00000000)
0027B74C  610294DB  (00000001, 00008000, 00000000, 61184ADA)
0027C77C  61005E37  (611AC5E8, 0027C7A4, 003A0000, 00320000)
0028C7AC  61022626  (611E2440, 00320000, 00324078, 00000002)
0028C7EC  61022814  (0028F9F0, 0028C828, 6102271D, 00000000)
End of stack trace
* * * (null) fini
CloseHandle(win32_obj_id<0x104>) failed virtual
pthread_mutex::~pthread_mutex(): 1585, Win32 error 6
So... is there any way to unload a DLL_LINK and "encourage" it to go to
the right place? Alternatively, is there a quieter way to kill off
failed child processes? I would imagine that only a rebase can make a
DLL_LINK always go where it belongs on the first try, much as I despise
the temporary band-aid that is rebasing.
Regards,
Ryan