This is the mail archive of the pthreads-win32@sources.redhat.com mailing list for the pthreads-win32 project.

pthread_cond_broadcast(...) leads to a deadlock

From: Alex Kotliarov <Alex at taquote dot com>
To: "'pthreads-win32 at sources dot redhat dot com'" <pthreads-win32 at sources dot redhat dot com>
Date: Thu, 18 Nov 2004 11:26:43 -0500
Subject: pthread_cond_broadcast(...) leads to a deadlock

Hi,

- My application, which uses one "producer" thread and N "consumer" threads
where N > 2, locks up, and the problem appears to be in the implementation
of the condition variable.

- The app locks up if I use pthread_cond_broadcast(...) to unblock the
waiting "consumers".

- The app does not lock up if pthread_cond_signal(...) is used instead.

- Code that causes the deadlock:
	procedure:  ptw32_cond_wait_cleanup(...)
		- The CV's external mutex is locked immediately upon entering
the procedure.
		- It must instead be locked just before exiting the procedure,
after "semBlockLock" (a binary semaphore) has been posted.

- Let's say that N "consumer" threads are waiting on the CV and the
"producer" thread broadcasts on that CV to wake up all consumers.

	given:
		semBlockLock semaphore's count == 0   (decremented in
ptw32_cond_unblock(...))

	1. All "consumers" wake up and enter ptw32_cond_wait_cleanup(...).
	2. One "consumer", ALPHA, acquires the CV's external mutex, executes
	   the cleanup code, returns from pthread_cond_wait(), and releases
	   the CV's external mutex.
	3. Another "consumer" acquires the CV's external mutex and cleans
	   up, and so on.
	4. ALPHA sees that the "producer"'s work queue is empty, decides to
	   wait on the CV again, acquires the CV's mutex, and enters
	   pthread_cond_wait(...).
	5. There are still "consumers" to be unblocked (nWaitersToUnblock
	   != 0), and they are not going anywhere because ALPHA holds the
	   CV's external mutex.
	6. ALPHA executes sem_wait( semBlockLock ) and we get a deadlock:
	   nWaitersToUnblock will never reach 0, so the semBlockLock
	   semaphore will never be incremented.

- solution:
	move  these lines:

		if ((result = pthread_mutex_lock (cleanup_args->mutexPtr)) != 0)
		  {
		    *resultPtr = result;
		    return;
		  }
	
	to the end of   ptw32_cond_wait_cleanup procedure:

		static void PTW32_CDECL
		ptw32_cond_wait_cleanup (void *args)
		{
		  .....
		  .....
		  .....
		  if (1 == nSignalsWasLeft)
		    {
		      if (sem_post (&(cv->semBlockLock)) != 0)
		        {
		          *resultPtr = errno;
		          return;
		        }
		    }
		  /*
		   * XSH: Upon successful return, the mutex has been locked and is owned
		   * by the calling thread. This must be done before any cancelation
		   * cleanup handlers are run.
		   */
		  if ((result = pthread_mutex_lock (cleanup_args->mutexPtr)) != 0)
		    {
		      *resultPtr = result;
		      return;
		    }
		}

   - Is there any reason why pthread_mutex_lock (cleanup_args->mutexPtr) was
moved to the top? Algorithm 8A has this line at the bottom of
ptw32_cond_wait_cleanup().
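
   For illustration only, the interleaving in steps 1-6 and the effect of the
proposed reordering can be replayed in a small single-threaded model. This is
a rough sketch, not the library's code: it tracks only the semBlockLock count,
nWaitersToUnblock and the owner of the external mutex, assumes N = 3, and
leaves out everything else (mtxUnblockLock, the wake-up queue, error paths).
All type and function names in it (struct model, step, deadlocks, the op
codes) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical op codes: each modelled thread is a short list of these. */
enum op { LOCK, UNLOCK, COUNT_DOWN, GATE_WAIT, GATE_POST, END };

enum { N = 3 };         /* number of "consumers" (assumption) */

struct model {
    int gate;           /* semBlockLock count; 0 after the broadcast */
    int to_unblock;     /* nWaitersToUnblock */
    int owner;          /* thread holding the external mutex, -1 if free */
};

struct thread {
    const enum op *prog;
    int pc;
};

/* Run one op of thread `id`; return false if it would block or has ended. */
static bool step(struct model *m, struct thread *t, int id)
{
    switch (t->prog[t->pc]) {
    case LOCK:
        if (m->owner != -1) return false;
        m->owner = id;
        break;
    case UNLOCK:
        assert(m->owner == id);
        m->owner = -1;
        break;
    case COUNT_DOWN:
        /* nSignalsWasLeft = nWaitersToUnblock--; the last waiter posts */
        if (m->to_unblock-- == 1) m->gate++;
        break;
    case GATE_WAIT:
        if (m->gate == 0) return false;
        m->gate--;
        break;
    case GATE_POST:
        m->gate++;
        break;
    case END:
        return false;
    }
    t->pc++;
    return true;
}

/* Replay the schedule; return true if some thread is left blocked. */
static bool deadlocks(bool lock_first)
{
    /* ALPHA: cleanup, caller unlocks, then waits again (lock, pass gate). */
    static const enum op alpha_first[] =
        { LOCK, COUNT_DOWN, UNLOCK, LOCK, GATE_WAIT, GATE_POST, UNLOCK, END };
    static const enum op alpha_last[] =
        { COUNT_DOWN, LOCK, UNLOCK, LOCK, GATE_WAIT, GATE_POST, UNLOCK, END };
    /* Other consumers: cleanup, then the caller unlocks. */
    static const enum op other_first[] = { LOCK, COUNT_DOWN, UNLOCK, END };
    static const enum op other_last[]  = { COUNT_DOWN, LOCK, UNLOCK, END };

    struct model m = { 0, N, -1 };
    struct thread t[N];
    int id;
    bool progress;

    t[0].prog = lock_first ? alpha_first : alpha_last;
    t[0].pc = 0;
    for (id = 1; id < N; id++) {
        t[id].prog = lock_first ? other_first : other_last;
        t[id].pc = 0;
    }

    /* Steps 1-2: ALPHA cleans up, returns and releases the mutex (3 ops). */
    for (id = 0; id < 3; id++) step(&m, &t[0], 0);
    /* Step 3: a second consumer does the same. */
    while (step(&m, &t[1], 1)) { }
    /* Steps 4-6: everybody runs round-robin until nothing can move. */
    do {
        progress = false;
        for (id = 0; id < N; id++)
            if (step(&m, &t[id], id)) progress = true;
    } while (progress);

    for (id = 0; id < N; id++)
        if (t[id].prog[t[id].pc] != END) return true;
    return false;
}
```

   In this model, deadlocks(true) (mutex locked at the top of the cleanup)
leaves ALPHA stuck at the semBlockLock wait while holding the mutex, and the
third consumer stuck at the mutex. deadlocks(false) (mutex locked at the
bottom) lets the third consumer count down and post semBlockLock first, so
every thread runs to completion.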


   Thanks,

   Alexander Kotliarov.

Follow-Ups:
- Re: pthread_cond_broadcast(...) leads to a deadlock
  - From: Ross Johnson
