Hi,
I had committed a fix for BZ #14652, which I now realize is
incomplete. This patch completes the fix. Details below:
A process with waiters and signallers on a condvar with a PI mutex
could hang in a deadlock when a waiter is cancelled just as it returns
from its futex call with the PI mutex held. The cancellation cleanup
handler tries to lock the mutex again and hence ends in a deadlock. My
earlier patch checked whether we already own the mutex and only call
the lock if needed.
The lock acquired by the FUTEX_WAIT_REQUEUE_PI however only sets the
owning TID in the lock value (mutex->__data.__lock) and as such, the
rest of the mutex data structure is inconsistent. The regular flow of
pthread_cond_*wait calls pthread_mutex_cond_lock_adjust to make the
data structure consistent once again. The attached patch does exactly
this.
Not having this fix results in failure of the unlock operation using
pthread_mutex_unlock with EPERM since the __user field of the mutex is
not set. This failure can theoretically be seen with tst-cond25, but
it is extremely rare. Adding PTHREAD_MUTEX_RECURSIVE to the mix
increases the chances somewhat, but not by much.
Tested on x86_64 and i686. OK to commit?
Regards,
Siddhesh
nptl/ChangeLog:
[BZ #14652]
* sysdeps/unix/sysv/linux/i386/i486/pthread_cond_timedwait.S
(__condvar_tw_cleanup): Adjust the mutex data structure if it
was locked by FUTEX_WAIT_REQUEUE_PI.
* sysdeps/unix/sysv/linux/i386/i486/pthread_cond_wait
(__condvar_w_cleanup): Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S
(__condvar_cleanup2): Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
(__condvar_cleanup1): Likewise.