This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH][BZ #14417] Unlock mutex before going back to waitingfor PI mutexes
- From: Torvald Riegel <triegel at redhat dot com>
- To: Siddhesh Poyarekar <siddhesh at redhat dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Sat, 29 Sep 2012 20:32:32 +0200
- Subject: Re: [PATCH][BZ #14417] Unlock mutex before going back to waitingfor PI mutexes
- References: <20120921195115.2319b183@spoyarek>
Siddhesh,
conceptually, I think the patch is okay. I do have a few
questions/comments regarding the implementation, however.
(Disclaimer: I don't deal with asm code often, so I might have missed a
thing or two.)
On Fri, 2012-09-21 at 19:51 +0530, Siddhesh Poyarekar wrote:
> /* Normal and PI futexes dont mix. Use normal futex functions
> only
> if the kernel does not support the PI futex functions. */
> cmpl $-ENOSYS, %eax
Perhaps add a comment here too that says that you only expect 0,
-EAGAIN, and -ENOSYS? (Or perhaps add an assertion right away?)
> @@ -204,12 +200,12 @@ __pthread_cond_wait:
> cmpl 8(%esp), %edx
> jne 7f
> cmpl 4(%esp), %edi
> - je 8b
> + je 22f
>
> 7: cmpl %ecx, %edx
> jne 9f
> cmp %eax, %edi
> - je 8b
> + je 22f
>
> 9: addl $1, woken_seq(%ebx)
> adcl $0, woken_seq+4(%ebx)
> @@ -285,6 +281,22 @@ __pthread_cond_wait:
> jmp 20b
>
> cfi_adjust_cfa_offset(-FRAME_SIZE);
> +
> + /* We need to go back to futex_wait. If we're using
> requeue_pi, then
> + release the mutex we had acquired and go back. */
I think the point here is not just that we're using requeue_pi, but that
we did that AND 16(%esp) is true iff the syscall did not return EAGAIN.
Then I'd prefer to either have a comment why we need to do this (in
comparison to the non-PI-aware base algorithm), or refer to the x86_64
code for additional docs.
>
> +22: movl 16(%esp), %edx
> + test %edx, %edx
> + jz 8b
> --- a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S
> +++ b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S
> @@ -103,7 +103,7 @@ __pthread_cond_timedwait:
> mov %RSI_LP, dep_mutex(%rdi)
>
> 22:
> - xorl %r15d, %r15d
> + xorb %r15b, %r15b
>
> #ifndef __ASSUME_FUTEX_CLOCK_REALTIME
> # ifdef PIC
> @@ -190,18 +190,36 @@ __pthread_cond_timedwait:
> movl $SYS_futex, %eax
> syscall
>
> - movl $1, %r15d
> + cmpl $0, %eax
> + sete %r15b
> +
> #ifdef __ASSUME_REQUEUE_PI
> jmp 62f
> #else
Did you test with __ASSUME_REQUEUE_PI defined? Shouldn't this jump be
after the EAGAIN checks? Or is this because you only expect ENOSYS as
only other error code you need to handle here? A comment perhaps?
> - cmpq $-4095, %rax
> - jnae 62f
> + je 62f
> +
> + /* If we raced for the futex with someone else, the syscall
> may return
We don't race for the futex (what does that mean?). I think we should
say "if the futex_wait was interrupted by a change to the futex
variable" or something like that.
> + an EAGAIN. Act as if you did a regular FUTEX_WAIT and
> returned
> + successfully from it. Let the sequence numbers hold you
> back if
> + you were not supposed to be woken up. That way, you don't
> forget
> + to take the mutex lock on your way out.
IMHO, I find that hard to understand. Perhaps be a bit more verbose?
Just pointing out the differences to the non-PI base algorithm might
also be useful.
> We retry using normal
> + futex functions only if the kernel returned ENOSYS, since
> normal
> + and PI futexes don't mix.
> +
> + Note that we don't check for EAGAIN specifically; we assume
> that the
> + only other error the futex function could return is EAGAIN
> (barring
> + the ETIMEOUT of course, for the timeout case in futex)
> since
> + anything else would mean an error in our function. It is
> too
> + expensive to do that check for every call (which is quite
> common in
> + case of a large number of threads), so it has been skipped.
> */
I would be surprised if a single branch after a syscall would be a
measurable performance degradation. Not sure what the policy for this
is, but I'd prefer an abort or such if we get a return value we didn't
expect. Also, if we get frequent spurious wake-ups, then this branch is
the smallest part of the overhead.