This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



semi-solved: fork-related access violations on win7-x64


Hi all,

I've isolated one source of access violations on my win7-x64 machine, and it's nasty.

The offending series of events is:
1. Two linked-in dlls share the same base address
2. The process forks
3. Windows assigns the child's dll a different base address than it chose for the parent
4. This code from dll::init () (dll_init.cc) runs in the child, with p still holding addresses from the parent:
1.75 (07-May-10): /* This should be a no-op. Why didn't we just import this variable? */
1.78 (27-Mar-11): if (!p.envptr)
1.78 (27-Mar-11): p.envptr = &__cygwin_environ;
1.79 (06-Apr-11): else if (*(p.envptr) != __cygwin_environ)
1.78 (27-Mar-11): *(p.envptr) = __cygwin_environ;

It was only recently that "somebody" noticed that the envptr could be wrong and added code to "fix" it, but that leaves all the other members of p just as wrong as before. If we're lucky, p points to unmapped memory, causing one access violation; otherwise, we jump off into la-la land and do who-knows-what with bad addresses.


It was trivial to make dll_list::alloc() call api_fatal() when it detects a parent/child handle mismatch; whatever spawns the child process is apparently willing to try as many as six times before giving up. Six retries give 8/60, around an 85% success rate for my toy benchmark, suggesting that Windows 7 has a ~25% probability of resolving a conflicting dll base address the same way twice in a row. This varies all over the map, though: sometimes fork() succeeds on the first try quite a few times in a row; other times it fails completely just as many times in a row, with 2-3 failures being the most common.
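
The ~25% estimate can be reproduced with simple arithmetic, assuming each attempt is independent and has the same chance of landing at the parent's address. That independence is an assumption (the streaks described above suggest it may not hold exactly), and the helper names below are hypothetical, so treat this as a rough sketch:

```cpp
#include <cmath>

// Probability that at least one of `tries` independent attempts succeeds,
// given a per-attempt success probability `p`.
double overall_success(double p, int tries) {
    return 1.0 - std::pow(1.0 - p, tries);
}

// Inverse: the per-attempt success probability implied by an observed
// overall success rate after `tries` independent attempts.
double per_try_success(double overall, int tries) {
    return 1.0 - std::pow(1.0 - overall, 1.0 / tries);
}
```

Under that assumption, per_try_success(0.85, 6) comes out near 0.27 and overall_success(0.25, 6) near 0.82, both roughly in line with the figures above.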

Unfortunately, the failed forks don't quite go away cleanly, since a static destructor from one of my two conflicting dlls tries to run (and fails), as does some cygwin-related finalization:
14428 [main] fork 6148 C:\cygwin\home\Ryan\experiments\fork-tests\fork.exe: *** fatal error - Location of C:\cygwin\home\Ryan\experiments\fork-tests\cygfoo.dll changed from 0x3A0000 (parent) to 0x320000 (child)
Stack trace:
Frame Function Args
0027B45C 610294DB (0027B45C, 00000000, 00000000, 00000000)
0027B74C 610294DB (00000001, 00008000, 00000000, 61184ADA)
0027C77C 61005E37 (611AC5E8, 0027C7A4, 003A0000, 00320000)
0028C7AC 61022626 (611E2440, 00320000, 00324078, 00000002)
0028C7EC 61022814 (0028F9F0, 0028C828, 6102271D, 00000000)
End of stack trace
* * * (null) fini
CloseHandle(win32_obj_id<0x104>) failed virtual pthread_mutex::~pthread_mutex(): 1585, Win32 error 6
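
The fatal error above comes from comparing the address a dll had in the parent with the one it received in the child. A minimal sketch of such a check, in portable C++, might look like the following. The name describe_base_mismatch and its parameters are hypothetical and chosen for illustration; this is not the actual dll_list::alloc() code.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper, not Cygwin code: returns an empty string when the
// child's load address matches the parent's, otherwise a diagnostic in the
// style of the fatal error shown above.
std::string describe_base_mismatch(const std::string& dll_path,
                                   std::uintptr_t parent_base,
                                   std::uintptr_t child_base) {
    if (parent_base == child_base)
        return "";  // same address: pointers recorded by the parent stay valid

    std::ostringstream msg;
    // std::hex and std::uppercase are sticky, so both addresses print as hex.
    msg << "Location of " << dll_path << " changed from 0x"
        << std::hex << std::uppercase << parent_base
        << " (parent) to 0x" << child_base << " (child)";
    return msg.str();
}
```

In the change described earlier, the mismatch is reported through api_fatal() rather than returned as a string; returning it here only keeps the sketch self-contained.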

So... is there any way to unload a DLL_LINK and "encourage" it to go to the right place? Alternatively, is there a quieter way to kill off failed child processes? I would imagine that only a rebase can make a DLL_LINK always go where it belongs on the first try, much as I despise the temporary band-aid that is rebasing.


Regards,
Ryan

