pthread_cancel+pthread_join problems when a thread executes "accept" TCP function

jdzstz - gmail dot com jdzstz@gmail.com
Thu Jan 27 22:22:00 GMT 2011


I have found a problem with pthread_join when the target thread is
executing the "accept" TCP function.

If a thread is blocked in an accept call and the parent calls
"pthread_cancel" followed by "pthread_join", then in Cygwin the
"pthread_join" blocks until "accept" returns. Since "accept" may
never return, the program can hang forever. On Linux and Solaris
it works fine (see below).

The problem shows up in one of the tools of the Varnish Cache program.
I have extracted the relevant part of the code into a small test program.

The program does the following:
   1.  main thread => calls socket(AF_INET, SOCK_STREAM, 0),
       bind(sd, (struct sockaddr *) &saddr, sizeof(saddr)) and
       listen(sock, 1)
   2.  main thread => executes
       rc = pthread_create(&thr, NULL, server_thread, (void *)t);
   3.  server_thread => executes "accept" and blocks until somebody
       connects to the port.
   4.  server_thread => if it receives an incoming connection, closes
       the connection and exits.
   5.  main thread => sleeps for x seconds (program argument)
   6.  main thread => executes pthread_cancel and pthread_join
   7.  main thread => closes the socket
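
The steps above can be sketched roughly as follows. This is not the
attached thread_accept_test.c, only a minimal reconstruction from the
description. The function name run_cancel_test, binding to an ephemeral
port, and passing the socket to the thread by pointer are assumptions
made for illustration.

```c
#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Steps 3-4: block in accept(); if a client connects, close it and exit. */
static void *server_thread(void *arg)
{
    int listen_fd = *(int *)arg;
    int conn_fd = accept(listen_fd, NULL, NULL);  /* cancellation point */

    if (conn_fd < 0)
        perror("Accept failed");
    else
        close(conn_fd);
    printf("Ending thread\n");
    return NULL;
}

/* Hypothetical driver for steps 1, 2 and 5-7.
 * Returns 1 if the thread was cancelled, 0 if it returned normally,
 * -1 if the socket or thread setup failed. */
int run_cancel_test(unsigned sleep_seconds)
{
    struct sockaddr_in saddr;
    pthread_t thr;
    void *res = NULL;
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock < 0)
        return -1;
    memset(&saddr, 0, sizeof(saddr));
    saddr.sin_family = AF_INET;
    saddr.sin_addr.s_addr = htonl(INADDR_ANY);
    saddr.sin_port = 0;  /* assumption: let the system pick a free port */
    if (bind(sock, (struct sockaddr *)&saddr, sizeof(saddr)) < 0 ||
        listen(sock, 1) < 0) {
        close(sock);
        return -1;
    }
    if (pthread_create(&thr, NULL, server_thread, &sock) != 0) {
        close(sock);
        return -1;
    }
    sleep(sleep_seconds);     /* step 5 */
    pthread_cancel(thr);      /* step 6 */
    pthread_join(thr, &res);  /* the reported hang on Cygwin is here */
    close(sock);              /* step 7 */
    return res == PTHREAD_CANCELED ? 1 : 0;
}
```

Called from a small main() as run_cancel_test(10), this corresponds to
running the test program with argument 10. It needs to be linked with
the pthread library (for example with gcc -pthread).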

I have run the test on Linux 2.6.16.60, Solaris 10, Cygwin 1.7.7
and Cygwin 1.7.8s(0.235/5/3) 20110114:

* In Linux, when "pthread_cancel"+"pthread_join" are executed while the
thread is blocked in the accept call, the thread is destroyed
immediately, "pthread_join" returns 0, and the thread's return value
is "0xffffffffffffffff" (PTHREAD_CANCELED, i.e. (void *)-1).
* In Solaris, when "pthread_cancel"+"pthread_join" are executed while
the thread is blocked in the accept call, accept is aborted and returns
an error, but the rest of the thread runs normally, and pthread_join
ends when the thread returns.
* In Cygwin (both versions), when "pthread_cancel"+"pthread_join" are
executed while the thread is blocked in the accept call, pthread_join
is also blocked, forever (or until accept is unblocked).

==== SOLARIS ====
$ /tmp/thread_accept_test 10
Start main
Opening socket on 0.0.0.0 61002
Main: Creating thread
Start sleep
Starting server thread, executing "accept"
End sleep
pthread_cancel
Waiting for server
Accepted failed: Error 0
Ending thread
Server returned "NULL"
End main
$

==== LINUX ====
$ /tmp/thread_accept_test 10
Start main
Opening socket on 0.0.0.0 50636
Main: Creating thread
Start sleep
Starting server thread, executing "accept"
End sleep
pthread_cancel
Waiting for server
Server returned "0xffffffffffffffff"
End main
$

==== CYGWIN ====
$ /tmp/thread_accept_test.exe 10
Start main
Opening socket on 0.0.0.0 3940
Main: Creating thread
Starting server thread, executing "accept"
Start sleep
End sleep
pthread_cancel
Waiting for server

<it never ends>


After searching on Google, I found a workaround for Cygwin: if I send
a signal to the thread, it unblocks the "accept" call:
(void)pthread_kill(thr,SIGUSR1);
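
A sketch of how that call might fit between pthread_cancel and
pthread_join is shown here. It assumes a do-nothing SIGUSR1 handler is
installed without SA_RESTART, since the default action for SIGUSR1
would terminate the whole process. The names on_sigusr1,
cancel_and_wake and demo_workaround are hypothetical, and I have not
confirmed that this matches what the real program should do.

```c
#define _POSIX_C_SOURCE 200809L
#include <netinet/in.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Do-nothing handler: its only job is to interrupt the blocking call. */
static void on_sigusr1(int signo) { (void)signo; }

/* Same shape as the server thread in the test: block in accept(). */
static void *blocked_in_accept(void *arg)
{
    int conn_fd = accept(*(int *)arg, NULL, NULL);

    if (conn_fd >= 0)
        close(conn_fd);
    return NULL;
}

/* Cancel the thread, wake it with SIGUSR1, then join it.
 * Returns the result of pthread_join, or -1 if sigaction fails. */
int cancel_and_wake(pthread_t thr, void **result)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  /* no SA_RESTART, so accept() is not restarted */
    if (sigaction(SIGUSR1, &sa, NULL) != 0)  /* applies process-wide */
        return -1;

    pthread_cancel(thr);
    (void)pthread_kill(thr, SIGUSR1);  /* the workaround from this post */
    return pthread_join(thr, result);
}

/* Hypothetical end-to-end check: returns 0 if the join completed. */
int demo_workaround(void)
{
    struct sockaddr_in saddr;
    pthread_t thr;
    void *res = NULL;
    int rc;
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock < 0)
        return -1;
    memset(&saddr, 0, sizeof(saddr));
    saddr.sin_family = AF_INET;
    saddr.sin_addr.s_addr = htonl(INADDR_ANY);
    saddr.sin_port = 0;  /* assumption: any free port */
    if (bind(sock, (struct sockaddr *)&saddr, sizeof(saddr)) < 0 ||
        listen(sock, 1) < 0 ||
        pthread_create(&thr, NULL, blocked_in_accept, &sock) != 0) {
        close(sock);
        return -1;
    }
    sleep(1);  /* give the thread time to block in accept() */
    rc = cancel_and_wake(thr, &res);
    close(sock);
    return rc;
}
```

The handler is installed for the whole process, so a program that
already uses SIGUSR1 for something else would need a different signal.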
-------------- next part --------------
A non-text attachment was scrubbed...
Name: thread_accept_test.c
Type: text/x-csrc
Size: 3407 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin/attachments/20110127/e29f929c/attachment.bin>