This is the mail archive of the
pthreads-win32@sources.redhat.com
mailing list for the pthreads-win32 project.
pthread_cond_broadcast(...) leads to a deadlock
- From: Alex Kotliarov <Alex at taquote dot com>
- To: "'pthreads-win32 at sources dot redhat dot com'" <pthreads-win32 at sources dot redhat dot com>
- Date: Thu, 18 Nov 2004 11:26:43 -0500
- Subject: pthread_cond_broadcast(...) leads to a deadlock
Hi,
- My application, which uses one "producer" thread and N "consumer"
threads (N > 2), locks up, and the cause appears to be a problem in the
implementation of the condition variable.
- The app locks up if I use pthread_cond_broadcast(...) to unblock the
waiting "consumers".
- The app does not lock up if pthread_cond_signal(...) is used instead.
- Code that causes the deadlock:
procedure: ptw32_cond_wait_cleanup(...)
- The CV's external mutex is locked immediately upon entering
the procedure.
- Instead, it should be locked just before exiting the procedure, after
"semBlockLock" (a binary semaphore) has been posted.
- Suppose N "consumer" threads are waiting on the CV and the "producer"
thread broadcasts on that CV to wake up all consumers.
Given:
semBlockLock semaphore's count == 0 (decremented in
ptw32_cond_unblock(...))
1. All "consumers" wake up and enter ptw32_cond_wait_cleanup(...).
2. One "consumer", ALPHA, acquires the CV's external mutex, executes
the cleanup code, returns from pthread_cond_wait(), and releases the
CV's external mutex.
3. Another "consumer" acquires the CV's external mutex and cleans up,
and so on.
4. ALPHA sees that the "producer"'s work queue is empty, decides to
wait on the CV again, acquires the CV's external mutex, and enters
pthread_cond_wait(...).
5. There are still "consumers" to be unblocked (nWaitersToUnblock != 0),
and they cannot make progress because ALPHA holds the CV's external
mutex.
6. ALPHA executes sem_wait( semBlockLock ), and we get a deadlock:
nWaitersToUnblock will never reach 0, so the semBlockLock semaphore
will never be incremented.
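For reference, the application pattern in the scenario above can be
sketched roughly as follows. This is only a minimal illustration under
assumptions: the names (run_broadcast_demo, consumer, queued, consumed,
done, MAX_CONSUMERS) are hypothetical and do not come from the actual
application, and only standard pthreads calls are used. On a correct
condition variable implementation the function returns once every item
has been consumed; with the lock-up described above it would be expected
to hang in pthread_cond_wait().

```c
#include <assert.h>
#include <pthread.h>

#define MAX_CONSUMERS 16   /* hypothetical upper bound for this sketch */

/* Hypothetical shared state; not taken from the original application. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv   = PTHREAD_COND_INITIALIZER;
static int queued;     /* items currently in the work queue */
static int consumed;   /* items taken by consumers so far */
static int done;       /* set once the producer has finished */

static void *consumer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    for (;;) {
        /* Step 4 of the scenario: the queue is empty, so wait again. */
        while (queued == 0 && !done)
            pthread_cond_wait(&cv, &lock);
        if (queued == 0)
            break;                 /* producer finished, nothing left */
        assert(queued > 0);
        queued--;
        consumed++;
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* The calling thread acts as the producer. Returns the number of items
 * consumed, or -1 if the consumers could not all be started. */
int run_broadcast_demo(int n_consumers, int n_items)
{
    pthread_t tids[MAX_CONSUMERS];
    int started = 0, failed = 0, i;

    if (n_consumers < 1 || n_consumers > MAX_CONSUMERS)
        return -1;
    queued = consumed = done = 0;

    for (i = 0; i < n_consumers; i++) {
        if (pthread_create(&tids[i], NULL, consumer, NULL) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    for (i = 0; !failed && i < n_items; i++) {
        pthread_mutex_lock(&lock);
        queued++;
        pthread_cond_broadcast(&cv);   /* wake every waiting consumer */
        pthread_mutex_unlock(&lock);
    }

    /* Tell the consumers to exit once the queue has drained. */
    pthread_mutex_lock(&lock);
    done = 1;
    pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&lock);

    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return failed ? -1 : consumed;
}
```

Calling run_broadcast_demo(4, 100) starts four consumers and queues 100
items; most broadcasts wake consumers that find the queue already empty
and go straight back to waiting, which is the situation in step 4.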
- Solution:
Move these lines:
    if ((result = pthread_mutex_lock (cleanup_args->mutexPtr)) != 0)
      {
        *resultPtr = result;
        return;
      }
to the end of the ptw32_cond_wait_cleanup procedure:
static void PTW32_CDECL
ptw32_cond_wait_cleanup (void *args)
{
  .....
  .....
  .....
  if (1 == nSignalsWasLeft)
    {
      if (sem_post (&(cv->semBlockLock)) != 0)
        {
          *resultPtr = errno;
          return;
        }
    }
  /*
   * XSH: Upon successful return, the mutex has been locked and is owned
   * by the calling thread. This must be done before any cancelation
   * cleanup handlers are run.
   */
  if ((result = pthread_mutex_lock (cleanup_args->mutexPtr)) != 0)
    {
      *resultPtr = result;
      return;
    }
}
- Is there any reason why pthread_mutex_lock (cleanup_args->mutexPtr) was
moved to the top? Algorithm 8A has this line at the bottom of
ptw32_cond_wait_cleanup().
Thanks,
Alexander Kotliarov.