1.5.25-15: pthread_join deadlocks

Ryan Johnson ryanjohn@ece.cmu.edu
Thu Nov 19 12:45:00 GMT 2009


Hi all,

I'm hitting a deadlock with Cygwin pthreads when joining a short-lived
thread: for me, the second create/join iteration of the test case below
almost never completes. It looks *exactly* like a problem that others
noticed as far back as early 2005 [0]. Judging from the strace output
for the test case (below), the culprit is almost certainly a racy
optimization in __cygwin_lock_*, which skips locking and unlocking
whenever the thread count is 1. A patch for it was submitted six months
ago [1].

As of today my cygwin distribution is completely up to date. Any hope of 
an update coming out soon?

Regards,
Ryan

[0] http://coding.derkeiler.com/Archive/General/comp.programming/2005-02/0786.html
[1] http://www.mail-archive.com/cygwin-patches@cygwin.com/msg04323.html

$ cat bug.cpp
#include <pthread.h>
#include <cassert>
#include <cstdio>
#define ANNOUNCE(what) fprintf(stderr, what "\n")
// Thread body: print one line and exit immediately.
extern "C" void* run(void*) {
    ANNOUNCE("Running");
    return 0;
}
int main() {
    pthread_t tid;
    ANNOUNCE("Starting");
    for(int i=0; i < 10; i++) {
        ANNOUNCE("Creating thread");
        // Keep the calls outside assert() so they survive -DNDEBUG.
        int rc = pthread_create(&tid, 0, &run, 0);
        assert(0 == rc);
        ANNOUNCE("Joining thread");
        rc = pthread_join(tid, 0);  // hangs here, usually on the second pass
        assert(0 == rc);
        (void) rc;
    }
    ANNOUNCE("Done");
}

$ g++ -Wall -g -mthreads bug.cpp && strace --mask=all+thread+paranoid+debug+uhoh ./a.exe
**********************************************
Program name: C:\cygwin\home\Ryan\experiments\a.exe (pid 2860, ppid 1)
App version:  1005.25, api: 0.156
DLL version:  1005.25, api: 0.156
DLL build:    2008-06-12 19:34
OS version:   Windows NT-5.1
Heap size:    402653184
Date/Time:    2009-11-19 11:58:33
**********************************************
   48   32123 [main] a 2860 __cygwin_lock_lock: threadcount 1.  not locking
   24   32147 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
   25   32172 [main] a 2860 __cygwin_lock_lock: threadcount 1.  not locking
Starting
   24   33868 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
   23   33891 [main] a 2860 __cygwin_lock_lock: threadcount 1.  not locking
Creating thread
   23   34109 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
  260   34369 [unknown (0x1C8C)] a 2860 pthread::thread_init_wrapper: started thread 0x100428B0 0xD8D008 0x61102D90 0x100428B0 0x401145 0x0
   33   34402 [unknown (0x1C8C)] a 2860 __cygwin_lock_lock: threadcount 2.  locking
   39   34561 [main] a 2860 __cygwin_lock_lock: threadcount 2.  locking
Running
   91   34860 [unknown (0x1C8C)] a 2860 __cygwin_lock_unlock: threadcount 2.  unlocked

***** Child thread exits here *****

Joining thread

***** Main thread decides it doesn't need to release the lock *****

   22   35166 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
   56   35222 [main] a 2860 __cygwin_lock_lock: threadcount 1.  not locking
Creating thread
   24   36990 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
  156   37146 [unknown (0x10B8)] a 2860 pthread::thread_init_wrapper: started thread 0x100428B0 0xD8D008 0x61102D90 0x100428B0 0x401145 0x0
   29   37175 [unknown (0x10B8)] a 2860 __cygwin_lock_lock: threadcount 2.  locking

***** Second child thread is now blocked on the lock, which main thread still holds *****

   25   37200 [main] a 2860 __cygwin_lock_lock: threadcount 2.  locking
Joining thread

***** Apparently recursive lock acquires work? *****

   25   38604 [main] a 2860 __cygwin_lock_unlock: threadcount 2.  unlocked


***** Unfortunately main still holds the lock and is now waiting in
pthread_join for the child it is blocking *****
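
To make the interleaving in the trace easier to follow, here is a small
single-threaded model of it. This is only a sketch under my own
assumptions: ElidingLock, second_child_blocks and the integer thread ids
are hypothetical names invented for illustration, not Cygwin source, and
the real __cygwin_lock_* code may differ in detail. The model assumes a
recursive lock whose lock and unlock calls are skipped whenever the
thread count passed in is 1, and it replays the order of events seen in
the strace output.

```cpp
#include <cassert>

// Hypothetical model of a recursive lock that skips lock/unlock while only
// one thread exists. An illustration only, not the Cygwin implementation.
struct ElidingLock {
    bool elide;        // enable the single-thread shortcut
    int  owner = -1;   // id of the owning thread, -1 when free
    int  depth = 0;    // recursion depth held by the owner

    explicit ElidingLock(bool e) : elide(e) {}

    // Returns false where a real thread would block.
    bool lock(int self, int threadcount) {
        if (elide && threadcount == 1) return true;      // "not locking"
        if (owner != -1 && owner != self) return false;  // held by another thread
        owner = self;
        ++depth;
        return true;
    }

    void unlock(int self, int threadcount) {
        if (elide && threadcount == 1) return;           // "not unlocking"
        assert(owner == self && depth > 0);
        if (--depth == 0) owner = -1;
    }
};

// Replays the trace. Returns true if the second child can never get the
// lock because main still owns it.
bool second_child_blocks(bool elide) {
    const int MAIN = 0, CHILD1 = 1, CHILD2 = 2;
    ElidingLock lk(elide);

    // First pass: the child prints "Running" under the lock, then releases it.
    bool ok = lk.lock(CHILD1, 2);
    assert(ok);
    lk.unlock(CHILD1, 2);

    // Main takes the lock to print "Joining thread" while threadcount is 2.
    ok = lk.lock(MAIN, 2);
    assert(ok);
    // The child has exited, so main's unlock sees threadcount 1.
    lk.unlock(MAIN, 1);

    // Second pass: the new child tries to take the lock for "Running".
    bool child_got_lock = lk.lock(CHILD2, 2);
    if (child_got_lock) lk.unlock(CHILD2, 2);

    // Main's lock/unlock pair around the second "Joining thread".
    ok = lk.lock(MAIN, 2);
    assert(ok);
    lk.unlock(MAIN, 2);
    (void) ok;

    return !child_got_lock && lk.owner == MAIN;
}
```

Calling second_child_blocks(true) reproduces the stuck state from the
trace: main's unlock is skipped, the second child cannot acquire the
lock, and main's later recursive lock/unlock pair leaves it still owning
the lock. With the shortcut disabled, second_child_blocks(false), every
lock is matched by a real unlock and the second child gets through.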




