This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 5/6][BZ #11588] x86_64: Remove assembly implementations for pthread_cond_*
- From: Darren Hart <dvhart at linux dot intel dot com>
- To: <gratian dot crisan at ni dot com>, <libc-alpha at sourceware dot org>
- Cc: Carlos O'Donell <carlos at redhat dot com>, Joseph Myers <joseph at codesourcery dot com>, Jeff Law <law at redhat dot com>, Scot Salmon <scot dot salmon at ni dot com>, Siddhesh Poyarekar <spoyarek at redhat dot com>, Thomas Gleixner <tglx at linutronix dot de>, Torvald Riegel <triegel at redhat dot com>, Clark Williams <williams at redhat dot com>, "Paul E. McKenney" <paulmck at linux dot vnet dot ibm dot com>, Will Newton <will dot newton at linaro dot org>, <gratian at gmail dot com>
- Date: Wed, 30 Jul 2014 20:58:03 -0700
- Subject: Re: [PATCH 5/6][BZ #11588] x86_64: Remove assembly implementations for pthread_cond_*
- Authentication-results: sourceware.org; auth=none
- References: <OF6ABEE614 dot FAE80AD2-ON86257D0E dot 006B38F4-86257D0E dot 0070034A at ni dot com> <1406680317-20189-1-git-send-email-gratian dot crisan at ni dot com> <1406680317-20189-6-git-send-email-gratian dot crisan at ni dot com>
On 7/29/14, 17:31, "gratian.crisan@ni.com" <gratian.crisan@ni.com> wrote:
>From: Gratian Crisan <gratian.crisan@ni.com>
>
>Switch x86_64 from using assembly implementations for pthread_cond_signal,
>pthread_cond_broadcast, pthread_cond_wait, and pthread_cond_timedwait to
>using the generic C implementation. Based on benchmarks results (see
>below)
>the C implementation is comparable in performance, easier to maintain,
>less
>bug prone, and supports priority inheritance for associated mutexes.
>Note: the bench-pthread_cond output was edited to fit within 80 columns by
>removing some white space and the 'variance' column.
The Atom tests in particular seem to vary *greatly* between the C and ASM
implementations. A 3825 is a Baytrail dual core (silvermont core) I
believe, which I would have expected some better performance from, with
fewer bubbles in the instruction pipeline, etc. Perhaps the compiler now
does a better job at this than the hand written asm in this case.
I would *love* to see the ASM go away though - thanks for including this.
--
Darren
>
>C implementation, quad core Intel(R) Xeon(R) CPU E5-1620 @3.60GHz, gcc
>4.7.3
>pthread_cond_[test] iter/threads mean min max std.
>dev
>--------------------------------------------------------------------------
>--
>signal (w/o waiters) 1000000/100 93.002 57 6519657 2679.6
>broadcast (w/o waiters) 1000000/100 96.6929 57 10231506
>2996.06
>signal 1000000/1 2833.97 532 92328
>1348.39
>broadcast 1000000/1 3317.85 704 172804
>1108.65
>signal/wait 100000/100 7726.83 3388 23269308
>22286.5
>signal/timedwait 100000/100 8148.47 3888 23172368
>18712.9
>broadcast/wait 100000/100 7895.33 3888 14886020
>14894.2
>broadcast/timedwait 100000/100 8362.07 3924 18439204
>19950.1
>
>Assembly implementation, quad core, Intel(R) Xeon(R) CPU E5-1620 @ 3.60GHz
>pthread_cond_[test] iter/threads mean min max std.
>dev
>--------------------------------------------------------------------------
>--
>signal (w/o waiters) 1000000/100 94.1301 57 69489528
>8016.01
>broadcast (w/o waiters) 1000000/100 104.562 57 300175497
>39393.4
>signal 1000000/1 2868.11 510 157149
>1363.98
>broadcast 1000000/1 3057.23 688 180376
>1192.49
>signal/wait 100000/100 7676.12 3340 24017028
>20393.1
>signal/timedwait 100000/100 8157.42 3856 28700448 22368
>broadcast/wait 100000/100 7871.86 3648 27913676
>21203.7
>broadcast/timedwait 100000/100 8300.47 4188 27813444
>24769.8
>
>C implementation, dual core Intel(R) Atom(TM) CPU E3825 @ 1.33GHz, gcc
>4.7.3
>pthread_cond_[test] iter/threads mean min max std.
>dev
>--------------------------------------------------------------------------
>--
>signal (w/o waiters) 1000000/100 95.077 90 28960
>33.3326
>broadcast (w/o waiters) 1000000/100 114.874 90 13820
>78.6426
>signal 1000000/1 6704.17 3510 49390
>3537.21
>broadcast 1000000/1 6726.35 3850 55430
>3297.21
>signal/wait 100000/100 16888.2 12240 6682020
>15045.4
>signal/timedwait 100000/100 19246.6 13560 6874950
>15969.5
>broadcast/wait 100000/100 17228.5 12390 6461480
>14780.2
>broadcast/timedwait 100000/100 19414.5 13910 6656950
>15681.8
>
>Assembly implementation, dual core Intel(R) Atom(TM) CPU E3825 @ 1.33GHz
>pthread_cond_[test] iter/threads mean min max std.
>dev
>--------------------------------------------------------------------------
>--
>signal (w/o waiters) 1000000/100 263.81 70 120171680 90138
>broadcast (w/o waiters) 1000000/100 264.213 70 160178010
>91861.4
>signal 1000000/1 15851.7 3800 13372770 13889
>broadcast 1000000/1 16095.2 5900 14940170
>16346.7
>signal/wait 100000/100 33151 7930 252746080 475402
>signal/timedwait 100000/100 34921.1 10950 147023040 270191
>broadcast/wait 100000/100 33400.2 11810 247194720 455105
>broadcast/timedwait 100000/100 35022.1 13610 161552720 30328
>
>Signed-off-by: Gratian Crisan <gratian.crisan@ni.com>
>
>--
>Changes since v1:
>* Re-run tests on "Intel(R) Xeon(R) CPU E5-1620 @ 3.60GHz" due to
>inadvertently using a debug version of the benchmark before.
>
>ChangeLog:
>
>2014-07-29 Gratian Crisan <gratian.crisan@ni.com>
>
> [BZ #11588]
> * sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S: Remove file.
> * sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S: Remove file.
> * sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S: Remove file.
> * sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: Remove file.
>
>---
> .../sysv/linux/x86_64/pthread_cond_broadcast.S | 179 -----
> .../unix/sysv/linux/x86_64/pthread_cond_signal.S | 164 ----
> .../sysv/linux/x86_64/pthread_cond_timedwait.S | 840
>---------------------
> sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S | 555 --------------
> 4 files changed, 1738 deletions(-)
> delete mode 100644
>sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S
> delete mode 100644 sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S
> delete mode 100644
>sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S
> delete mode 100644 sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
>
>diff --git a/sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S
>b/sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S
>deleted file mode 100644
>index 985e0f1..0000000
>--- a/sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S
>+++ /dev/null
>@@ -1,179 +0,0 @@
>-/* Copyright (C) 2002-2014 Free Software Foundation, Inc.
>- This file is part of the GNU C Library.
>- Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
>-
>- The GNU C Library is free software; you can redistribute it and/or
>- modify it under the terms of the GNU Lesser General Public
>- License as published by the Free Software Foundation; either
>- version 2.1 of the License, or (at your option) any later version.
>-
>- The GNU C Library is distributed in the hope that it will be useful,
>- but WITHOUT ANY WARRANTY; without even the implied warranty of
>- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>- Lesser General Public License for more details.
>-
>- You should have received a copy of the GNU Lesser General Public
>- License along with the GNU C Library; if not, see
>- <http://www.gnu.org/licenses/>. */
>-
>-#include <sysdep.h>
>-#include <shlib-compat.h>
>-#include <lowlevellock.h>
>-#include <lowlevelcond.h>
>-#include <kernel-features.h>
>-#include <pthread-pi-defines.h>
>-#include <pthread-errnos.h>
>-#include <stap-probe.h>
>-
>- .text
>-
>- /* int pthread_cond_broadcast (pthread_cond_t *cond) */
>- .globl __pthread_cond_broadcast
>- .type __pthread_cond_broadcast, @function
>- .align 16
>-__pthread_cond_broadcast:
>-
>- LIBC_PROBE (cond_broadcast, 1, %rdi)
>-
>- /* Get internal lock. */
>- movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-#if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-#else
>- cmpxchgl %esi, cond_lock(%rdi)
>-#endif
>- jnz 1f
>-
>-2: addq $cond_futex, %rdi
>- movq total_seq-cond_futex(%rdi), %r9
>- cmpq wakeup_seq-cond_futex(%rdi), %r9
>- jna 4f
>-
>- /* Cause all currently waiting threads to recognize they are
>- woken up. */
>- movq %r9, wakeup_seq-cond_futex(%rdi)
>- movq %r9, woken_seq-cond_futex(%rdi)
>- addq %r9, %r9
>- movl %r9d, (%rdi)
>- incl broadcast_seq-cond_futex(%rdi)
>-
>- /* Get the address of the mutex used. */
>- mov dep_mutex-cond_futex(%rdi), %R8_LP
>-
>- /* Unlock. */
>- LOCK
>- decl cond_lock-cond_futex(%rdi)
>- jne 7f
>-
>-8: cmp $-1, %R8_LP
>- je 9f
>-
>- /* Do not use requeue for pshared condvars. */
>- testl $PS_BIT, MUTEX_KIND(%r8)
>- jne 9f
>-
>- /* Requeue to a PI mutex if the PI bit is set. */
>- movl MUTEX_KIND(%r8), %eax
>- andl $(ROBUST_BIT|PI_BIT), %eax
>- cmpl $PI_BIT, %eax
>- je 81f
>-
>- /* Wake up all threads. */
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $(FUTEX_CMP_REQUEUE|FUTEX_PRIVATE_FLAG), %esi
>-#else
>- movl %fs:PRIVATE_FUTEX, %esi
>- orl $FUTEX_CMP_REQUEUE, %esi
>-#endif
>- movl $SYS_futex, %eax
>- movl $1, %edx
>- movl $0x7fffffff, %r10d
>- syscall
>-
>- /* For any kind of error, which mainly is EAGAIN, we try again
>- with WAKE. The general test also covers running on old
>- kernels. */
>- cmpq $-4095, %rax
>- jae 9f
>-
>-10: xorl %eax, %eax
>- retq
>-
>- /* Wake up all threads. */
>-81: movl $(FUTEX_CMP_REQUEUE_PI|FUTEX_PRIVATE_FLAG), %esi
>- movl $SYS_futex, %eax
>- movl $1, %edx
>- movl $0x7fffffff, %r10d
>- syscall
>-
>- /* For any kind of error, which mainly is EAGAIN, we try again
>- with WAKE. The general test also covers running on old
>- kernels. */
>- cmpq $-4095, %rax
>- jb 10b
>- jmp 9f
>-
>- .align 16
>- /* Unlock. */
>-4: LOCK
>- decl cond_lock-cond_futex(%rdi)
>- jne 5f
>-
>-6: xorl %eax, %eax
>- retq
>-
>- /* Initial locking failed. */
>-1:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>-#if cond_lock != 0
>- subq $cond_lock, %rdi
>-#endif
>- jmp 2b
>-
>- /* Unlock in loop requires wakeup. */
>-5: addq $cond_lock-cond_futex, %rdi
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>- jmp 6b
>-
>- /* Unlock in loop requires wakeup. */
>-7: addq $cond_lock-cond_futex, %rdi
>- cmp $-1, %R8_LP
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>- subq $cond_lock-cond_futex, %rdi
>- jmp 8b
>-
>-9: /* The futex requeue functionality is not available. */
>- cmp $-1, %R8_LP
>- movl $0x7fffffff, %edx
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $FUTEX_WAKE, %eax
>- movl $(FUTEX_WAKE|FUTEX_PRIVATE_FLAG), %esi
>- cmove %eax, %esi
>-#else
>- movl $0, %eax
>- movl %fs:PRIVATE_FUTEX, %esi
>- cmove %eax, %esi
>- orl $FUTEX_WAKE, %esi
>-#endif
>- movl $SYS_futex, %eax
>- syscall
>- jmp 10b
>- .size __pthread_cond_broadcast, .-__pthread_cond_broadcast
>-versioned_symbol (libpthread, __pthread_cond_broadcast,
>pthread_cond_broadcast,
>- GLIBC_2_3_2)
>diff --git a/sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S
>b/sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S
>deleted file mode 100644
>index 53d65b6..0000000
>--- a/sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S
>+++ /dev/null
>@@ -1,164 +0,0 @@
>-/* Copyright (C) 2002-2014 Free Software Foundation, Inc.
>- This file is part of the GNU C Library.
>- Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
>-
>- The GNU C Library is free software; you can redistribute it and/or
>- modify it under the terms of the GNU Lesser General Public
>- License as published by the Free Software Foundation; either
>- version 2.1 of the License, or (at your option) any later version.
>-
>- The GNU C Library is distributed in the hope that it will be useful,
>- but WITHOUT ANY WARRANTY; without even the implied warranty of
>- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>- Lesser General Public License for more details.
>-
>- You should have received a copy of the GNU Lesser General Public
>- License along with the GNU C Library; if not, see
>- <http://www.gnu.org/licenses/>. */
>-
>-#include <sysdep.h>
>-#include <shlib-compat.h>
>-#include <lowlevellock.h>
>-#include <lowlevelcond.h>
>-#include <pthread-pi-defines.h>
>-#include <kernel-features.h>
>-#include <pthread-errnos.h>
>-#include <stap-probe.h>
>-
>-
>- .text
>-
>- /* int pthread_cond_signal (pthread_cond_t *cond) */
>- .globl __pthread_cond_signal
>- .type __pthread_cond_signal, @function
>- .align 16
>-__pthread_cond_signal:
>-
>- LIBC_PROBE (cond_signal, 1, %rdi)
>-
>- /* Get internal lock. */
>- movq %rdi, %r8
>- movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-#if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-#else
>- cmpxchgl %esi, cond_lock(%rdi)
>-#endif
>- jnz 1f
>-
>-2: addq $cond_futex, %rdi
>- movq total_seq(%r8), %rcx
>- cmpq wakeup_seq(%r8), %rcx
>- jbe 4f
>-
>- /* Bump the wakeup number. */
>- addq $1, wakeup_seq(%r8)
>- addl $1, (%rdi)
>-
>- /* Wake up one thread. */
>- LP_OP(cmp) $-1, dep_mutex(%r8)
>- movl $FUTEX_WAKE_OP, %esi
>- movl $1, %edx
>- movl $SYS_futex, %eax
>- je 8f
>-
>- /* Get the address of the mutex used. */
>- mov dep_mutex(%r8), %RCX_LP
>- movl MUTEX_KIND(%rcx), %r11d
>- andl $(ROBUST_BIT|PI_BIT), %r11d
>- cmpl $PI_BIT, %r11d
>- je 9f
>-
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $(FUTEX_WAKE_OP|FUTEX_PRIVATE_FLAG), %esi
>-#else
>- orl %fs:PRIVATE_FUTEX, %esi
>-#endif
>-
>-8: movl $1, %r10d
>-#if cond_lock != 0
>- addq $cond_lock, %r8
>-#endif
>- movl $FUTEX_OP_CLEAR_WAKE_IF_GT_ONE, %r9d
>- syscall
>-#if cond_lock != 0
>- subq $cond_lock, %r8
>-#endif
>- /* For any kind of error, we try again with WAKE.
>- The general test also covers running on old kernels. */
>- cmpq $-4095, %rax
>- jae 7f
>-
>- xorl %eax, %eax
>- retq
>-
>- /* Wake up one thread and requeue none in the PI Mutex case. */
>-9: movl $(FUTEX_CMP_REQUEUE_PI|FUTEX_PRIVATE_FLAG), %esi
>- movq %rcx, %r8
>- xorq %r10, %r10
>- movl (%rdi), %r9d // XXX Can this be right?
>- syscall
>-
>- leaq -cond_futex(%rdi), %r8
>-
>- /* For any kind of error, we try again with WAKE.
>- The general test also covers running on old kernels. */
>- cmpq $-4095, %rax
>- jb 4f
>-
>-7:
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- andl $FUTEX_PRIVATE_FLAG, %esi
>-#else
>- andl %fs:PRIVATE_FUTEX, %esi
>-#endif
>- orl $FUTEX_WAKE, %esi
>- movl $SYS_futex, %eax
>- /* %rdx should be 1 already from $FUTEX_WAKE_OP syscall.
>- movl $1, %edx */
>- syscall
>-
>- /* Unlock. */
>-4: LOCK
>-#if cond_lock == 0
>- decl (%r8)
>-#else
>- decl cond_lock(%r8)
>-#endif
>- jne 5f
>-
>-6: xorl %eax, %eax
>- retq
>-
>- /* Initial locking failed. */
>-1:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>-#if cond_lock != 0
>- subq $cond_lock, %rdi
>-#endif
>- jmp 2b
>-
>- /* Unlock in loop requires wakeup. */
>-5:
>- movq %r8, %rdi
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>- jmp 6b
>- .size __pthread_cond_signal, .-__pthread_cond_signal
>-versioned_symbol (libpthread, __pthread_cond_signal, pthread_cond_signal,
>- GLIBC_2_3_2)
>diff --git a/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S
>b/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S
>deleted file mode 100644
>index 0dc2340..0000000
>--- a/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S
>+++ /dev/null
>@@ -1,840 +0,0 @@
>-/* Copyright (C) 2002-2014 Free Software Foundation, Inc.
>- This file is part of the GNU C Library.
>- Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
>-
>- The GNU C Library is free software; you can redistribute it and/or
>- modify it under the terms of the GNU Lesser General Public
>- License as published by the Free Software Foundation; either
>- version 2.1 of the License, or (at your option) any later version.
>-
>- The GNU C Library is distributed in the hope that it will be useful,
>- but WITHOUT ANY WARRANTY; without even the implied warranty of
>- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>- Lesser General Public License for more details.
>-
>- You should have received a copy of the GNU Lesser General Public
>- License along with the GNU C Library; if not, see
>- <http://www.gnu.org/licenses/>. */
>-
>-#include <sysdep.h>
>-#include <shlib-compat.h>
>-#include <lowlevellock.h>
>-#include <lowlevelcond.h>
>-#include <pthread-pi-defines.h>
>-#include <pthread-errnos.h>
>-#include <stap-probe.h>
>-
>-#include <kernel-features.h>
>-
>-
>- .text
>-
>-
>-/* int pthread_cond_timedwait (pthread_cond_t *cond, pthread_mutex_t
>*mutex,
>- const struct timespec *abstime) */
>- .globl __pthread_cond_timedwait
>- .type __pthread_cond_timedwait, @function
>- .align 16
>-__pthread_cond_timedwait:
>-.LSTARTCODE:
>- cfi_startproc
>-#ifdef SHARED
>- cfi_personality(DW_EH_PE_pcrel | DW_EH_PE_sdata4 | DW_EH_PE_indirect,
>- DW.ref.__gcc_personality_v0)
>- cfi_lsda(DW_EH_PE_pcrel | DW_EH_PE_sdata4, .LexceptSTART)
>-#else
>- cfi_personality(DW_EH_PE_udata4, __gcc_personality_v0)
>- cfi_lsda(DW_EH_PE_udata4, .LexceptSTART)
>-#endif
>-
>- pushq %r12
>- cfi_adjust_cfa_offset(8)
>- cfi_rel_offset(%r12, 0)
>- pushq %r13
>- cfi_adjust_cfa_offset(8)
>- cfi_rel_offset(%r13, 0)
>- pushq %r14
>- cfi_adjust_cfa_offset(8)
>- cfi_rel_offset(%r14, 0)
>- pushq %r15
>- cfi_adjust_cfa_offset(8)
>- cfi_rel_offset(%r15, 0)
>-#ifdef __ASSUME_FUTEX_CLOCK_REALTIME
>-# define FRAME_SIZE (32+8)
>-#else
>-# define FRAME_SIZE (48+8)
>-#endif
>- subq $FRAME_SIZE, %rsp
>- cfi_adjust_cfa_offset(FRAME_SIZE)
>- cfi_remember_state
>-
>- LIBC_PROBE (cond_timedwait, 3, %rdi, %rsi, %rdx)
>-
>- cmpq $1000000000, 8(%rdx)
>- movl $EINVAL, %eax
>- jae 48f
>-
>- /* Stack frame:
>-
>- rsp + 48
>- +--------------------------+
>- rsp + 32 | timeout value |
>- +--------------------------+
>- rsp + 24 | old wake_seq value |
>- +--------------------------+
>- rsp + 16 | mutex pointer |
>- +--------------------------+
>- rsp + 8 | condvar pointer |
>- +--------------------------+
>- rsp + 4 | old broadcast_seq value |
>- +--------------------------+
>- rsp + 0 | old cancellation mode |
>- +--------------------------+
>- */
>-
>- LP_OP(cmp) $-1, dep_mutex(%rdi)
>-
>- /* Prepare structure passed to cancellation handler. */
>- movq %rdi, 8(%rsp)
>- movq %rsi, 16(%rsp)
>- movq %rdx, %r13
>-
>- je 22f
>- mov %RSI_LP, dep_mutex(%rdi)
>-
>-22:
>- xorb %r15b, %r15b
>-
>-#ifndef __ASSUME_FUTEX_CLOCK_REALTIME
>-# ifdef PIC
>- cmpl $0, __have_futex_clock_realtime(%rip)
>-# else
>- cmpl $0, __have_futex_clock_realtime
>-# endif
>- je .Lreltmo
>-#endif
>-
>- /* Get internal lock. */
>- movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-#if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-#else
>- cmpxchgl %esi, cond_lock(%rdi)
>-#endif
>- jnz 31f
>-
>- /* Unlock the mutex. */
>-32: movq 16(%rsp), %rdi
>- xorl %esi, %esi
>- callq __pthread_mutex_unlock_usercnt
>-
>- testl %eax, %eax
>- jne 46f
>-
>- movq 8(%rsp), %rdi
>- incq total_seq(%rdi)
>- incl cond_futex(%rdi)
>- addl $(1 << nwaiters_shift), cond_nwaiters(%rdi)
>-
>- /* Get and store current wakeup_seq value. */
>- movq 8(%rsp), %rdi
>- movq wakeup_seq(%rdi), %r9
>- movl broadcast_seq(%rdi), %edx
>- movq %r9, 24(%rsp)
>- movl %edx, 4(%rsp)
>-
>- cmpq $0, (%r13)
>- movq $-ETIMEDOUT, %r14
>- js 36f
>-
>-38: movl cond_futex(%rdi), %r12d
>-
>- /* Unlock. */
>- LOCK
>-#if cond_lock == 0
>- decl (%rdi)
>-#else
>- decl cond_lock(%rdi)
>-#endif
>- jne 33f
>-
>-.LcleanupSTART1:
>-34: callq __pthread_enable_asynccancel
>- movl %eax, (%rsp)
>-
>- movq %r13, %r10
>- movl $FUTEX_WAIT_BITSET, %esi
>- LP_OP(cmp) $-1, dep_mutex(%rdi)
>- je 60f
>-
>- mov dep_mutex(%rdi), %R8_LP
>- /* Requeue to a non-robust PI mutex if the PI bit is set and
>- the robust bit is not set. */
>- movl MUTEX_KIND(%r8), %eax
>- andl $(ROBUST_BIT|PI_BIT), %eax
>- cmpl $PI_BIT, %eax
>- jne 61f
>-
>- movl $(FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG), %esi
>- xorl %eax, %eax
>- /* The following only works like this because we only support
>- two clocks, represented using a single bit. */
>- testl $1, cond_nwaiters(%rdi)
>- movl $FUTEX_CLOCK_REALTIME, %edx
>- cmove %edx, %eax
>- orl %eax, %esi
>- movq %r12, %rdx
>- addq $cond_futex, %rdi
>- movl $SYS_futex, %eax
>- syscall
>-
>- cmpl $0, %eax
>- sete %r15b
>-
>-#ifdef __ASSUME_REQUEUE_PI
>- jmp 62f
>-#else
>- je 62f
>-
>- /* When a futex syscall with FUTEX_WAIT_REQUEUE_PI returns
>- successfully, it has already locked the mutex for us and the
>- pi_flag (%r15b) is set to denote that fact. However, if another
>- thread changed the futex value before we entered the wait, the
>- syscall may return an EAGAIN and the mutex is not locked. We go
>- ahead with a success anyway since later we look at the pi_flag to
>- decide if we got the mutex or not. The sequence numbers then make
>- sure that only one of the threads actually wake up. We retry using
>- normal FUTEX_WAIT only if the kernel returned ENOSYS, since normal
>- and PI futexes don't mix.
>-
>- Note that we don't check for EAGAIN specifically; we assume that the
>- only other error the futex function could return is EAGAIN (barring
>- the ETIMEOUT of course, for the timeout case in futex) since
>- anything else would mean an error in our function. It is too
>- expensive to do that check for every call (which is quite common in
>- case of a large number of threads), so it has been skipped. */
>- cmpl $-ENOSYS, %eax
>- jne 62f
>-
>- subq $cond_futex, %rdi
>-#endif
>-
>-61: movl $(FUTEX_WAIT_BITSET|FUTEX_PRIVATE_FLAG), %esi
>-60: xorb %r15b, %r15b
>- xorl %eax, %eax
>- /* The following only works like this because we only support
>- two clocks, represented using a single bit. */
>- testl $1, cond_nwaiters(%rdi)
>- movl $FUTEX_CLOCK_REALTIME, %edx
>- movl $0xffffffff, %r9d
>- cmove %edx, %eax
>- orl %eax, %esi
>- movq %r12, %rdx
>- addq $cond_futex, %rdi
>- movl $SYS_futex, %eax
>- syscall
>-62: movq %rax, %r14
>-
>- movl (%rsp), %edi
>- callq __pthread_disable_asynccancel
>-.LcleanupEND1:
>-
>- /* Lock. */
>- movq 8(%rsp), %rdi
>- movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-#if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-#else
>- cmpxchgl %esi, cond_lock(%rdi)
>-#endif
>- jne 35f
>-
>-36: movl broadcast_seq(%rdi), %edx
>-
>- movq woken_seq(%rdi), %rax
>-
>- movq wakeup_seq(%rdi), %r9
>-
>- cmpl 4(%rsp), %edx
>- jne 53f
>-
>- cmpq 24(%rsp), %r9
>- jbe 45f
>-
>- cmpq %rax, %r9
>- ja 39f
>-
>-45: cmpq $-ETIMEDOUT, %r14
>- je 99f
>-
>- /* We need to go back to futex_wait. If we're using requeue_pi, then
>- release the mutex we had acquired and go back. */
>- test %r15b, %r15b
>- jz 38b
>-
>- /* Adjust the mutex values first and then unlock it. The unlock
>- should always succeed or else the kernel did not lock the
>- mutex correctly. */
>- movq %r8, %rdi
>- callq __pthread_mutex_cond_lock_adjust
>- xorl %esi, %esi
>- callq __pthread_mutex_unlock_usercnt
>- /* Reload cond_var. */
>- movq 8(%rsp), %rdi
>- jmp 38b
>-
>-99: incq wakeup_seq(%rdi)
>- incl cond_futex(%rdi)
>- movl $ETIMEDOUT, %r14d
>- jmp 44f
>-
>-53: xorq %r14, %r14
>- jmp 54f
>-
>-39: xorq %r14, %r14
>-44: incq woken_seq(%rdi)
>-
>-54: subl $(1 << nwaiters_shift), cond_nwaiters(%rdi)
>-
>- /* Wake up a thread which wants to destroy the condvar object. */
>- cmpq $0xffffffffffffffff, total_seq(%rdi)
>- jne 55f
>- movl cond_nwaiters(%rdi), %eax
>- andl $~((1 << nwaiters_shift) - 1), %eax
>- jne 55f
>-
>- addq $cond_nwaiters, %rdi
>- LP_OP(cmp) $-1, dep_mutex-cond_nwaiters(%rdi)
>- movl $1, %edx
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $FUTEX_WAKE, %eax
>- movl $(FUTEX_WAKE|FUTEX_PRIVATE_FLAG), %esi
>- cmove %eax, %esi
>-#else
>- movl $0, %eax
>- movl %fs:PRIVATE_FUTEX, %esi
>- cmove %eax, %esi
>- orl $FUTEX_WAKE, %esi
>-#endif
>- movl $SYS_futex, %eax
>- syscall
>- subq $cond_nwaiters, %rdi
>-
>-55: LOCK
>-#if cond_lock == 0
>- decl (%rdi)
>-#else
>- decl cond_lock(%rdi)
>-#endif
>- jne 40f
>-
>- /* If requeue_pi is used the kernel performs the locking of the
>- mutex. */
>-41: movq 16(%rsp), %rdi
>- testb %r15b, %r15b
>- jnz 64f
>-
>- callq __pthread_mutex_cond_lock
>-
>-63: testq %rax, %rax
>- cmoveq %r14, %rax
>-
>-48: addq $FRAME_SIZE, %rsp
>- cfi_adjust_cfa_offset(-FRAME_SIZE)
>- popq %r15
>- cfi_adjust_cfa_offset(-8)
>- cfi_restore(%r15)
>- popq %r14
>- cfi_adjust_cfa_offset(-8)
>- cfi_restore(%r14)
>- popq %r13
>- cfi_adjust_cfa_offset(-8)
>- cfi_restore(%r13)
>- popq %r12
>- cfi_adjust_cfa_offset(-8)
>- cfi_restore(%r12)
>-
>- retq
>-
>- cfi_restore_state
>-
>-64: callq __pthread_mutex_cond_lock_adjust
>- movq %r14, %rax
>- jmp 48b
>-
>- /* Initial locking failed. */
>-31:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>- jmp 32b
>-
>- /* Unlock in loop requires wakeup. */
>-33:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>- jmp 34b
>-
>- /* Locking in loop failed. */
>-35:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>-#if cond_lock != 0
>- subq $cond_lock, %rdi
>-#endif
>- jmp 36b
>-
>- /* Unlock after loop requires wakeup. */
>-40:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>- jmp 41b
>-
>- /* The initial unlocking of the mutex failed. */
>-46: movq 8(%rsp), %rdi
>- movq %rax, (%rsp)
>- LOCK
>-#if cond_lock == 0
>- decl (%rdi)
>-#else
>- decl cond_lock(%rdi)
>-#endif
>- jne 47f
>-
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>-
>-47: movq (%rsp), %rax
>- jmp 48b
>-
>-
>-#ifndef __ASSUME_FUTEX_CLOCK_REALTIME
>-.Lreltmo:
>- /* Get internal lock. */
>- movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-# if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-# else
>- cmpxchgl %esi, cond_lock(%rdi)
>-# endif
>- jnz 1f
>-
>- /* Unlock the mutex. */
>-2: movq 16(%rsp), %rdi
>- xorl %esi, %esi
>- callq __pthread_mutex_unlock_usercnt
>-
>- testl %eax, %eax
>- jne 46b
>-
>- movq 8(%rsp), %rdi
>- incq total_seq(%rdi)
>- incl cond_futex(%rdi)
>- addl $(1 << nwaiters_shift), cond_nwaiters(%rdi)
>-
>- /* Get and store current wakeup_seq value. */
>- movq 8(%rsp), %rdi
>- movq wakeup_seq(%rdi), %r9
>- movl broadcast_seq(%rdi), %edx
>- movq %r9, 24(%rsp)
>- movl %edx, 4(%rsp)
>-
>- /* Get the current time. */
>-8:
>-# ifdef __NR_clock_gettime
>- /* Get the clock number. Note that the field in the condvar
>- structure stores the number minus 1. */
>- movq 8(%rsp), %rdi
>- movl cond_nwaiters(%rdi), %edi
>- andl $((1 << nwaiters_shift) - 1), %edi
>- /* Only clocks 0 and 1 are allowed so far. Both are handled in the
>- kernel. */
>- leaq 32(%rsp), %rsi
>-# ifdef SHARED
>- mov __vdso_clock_gettime@GOTPCREL(%rip), %RAX_LP
>- mov (%rax), %RAX_LP
>- PTR_DEMANGLE (%RAX_LP)
>- call *%rax
>-# else
>- movl $__NR_clock_gettime, %eax
>- syscall
>-# endif
>-
>- /* Compute relative timeout. */
>- movq (%r13), %rcx
>- movq 8(%r13), %rdx
>- subq 32(%rsp), %rcx
>- subq 40(%rsp), %rdx
>-# else
>- leaq 24(%rsp), %rdi
>- xorl %esi, %esi
>- /* This call works because we directly jump to a system call entry
>- which preserves all the registers. */
>- call JUMPTARGET(__gettimeofday)
>-
>- /* Compute relative timeout. */
>- movq 40(%rsp), %rax
>- movl $1000, %edx
>- mul %rdx /* Milli seconds to nano seconds. */
>- movq (%r13), %rcx
>- movq 8(%r13), %rdx
>- subq 32(%rsp), %rcx
>- subq %rax, %rdx
>-# endif
>- jns 12f
>- addq $1000000000, %rdx
>- decq %rcx
>-12: testq %rcx, %rcx
>- movq 8(%rsp), %rdi
>- movq $-ETIMEDOUT, %r14
>- js 6f
>-
>- /* Store relative timeout. */
>-21: movq %rcx, 32(%rsp)
>- movq %rdx, 40(%rsp)
>-
>- movl cond_futex(%rdi), %r12d
>-
>- /* Unlock. */
>- LOCK
>-# if cond_lock == 0
>- decl (%rdi)
>-# else
>- decl cond_lock(%rdi)
>-# endif
>- jne 3f
>-
>-.LcleanupSTART2:
>-4: callq __pthread_enable_asynccancel
>- movl %eax, (%rsp)
>-
>- leaq 32(%rsp), %r10
>- LP_OP(cmp) $-1, dep_mutex(%rdi)
>- movq %r12, %rdx
>-# ifdef __ASSUME_PRIVATE_FUTEX
>- movl $FUTEX_WAIT, %eax
>- movl $(FUTEX_WAIT|FUTEX_PRIVATE_FLAG), %esi
>- cmove %eax, %esi
>-# else
>- movl $0, %eax
>- movl %fs:PRIVATE_FUTEX, %esi
>- cmove %eax, %esi
>-# if FUTEX_WAIT != 0
>- orl $FUTEX_WAIT, %esi
>-# endif
>-# endif
>- addq $cond_futex, %rdi
>- movl $SYS_futex, %eax
>- syscall
>- movq %rax, %r14
>-
>- movl (%rsp), %edi
>- callq __pthread_disable_asynccancel
>-.LcleanupEND2:
>-
>- /* Lock. */
>- movq 8(%rsp), %rdi
>- movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-# if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-# else
>- cmpxchgl %esi, cond_lock(%rdi)
>-# endif
>- jne 5f
>-
>-6: movl broadcast_seq(%rdi), %edx
>-
>- movq woken_seq(%rdi), %rax
>-
>- movq wakeup_seq(%rdi), %r9
>-
>- cmpl 4(%rsp), %edx
>- jne 53b
>-
>- cmpq 24(%rsp), %r9
>- jbe 15f
>-
>- cmpq %rax, %r9
>- ja 39b
>-
>-15: cmpq $-ETIMEDOUT, %r14
>- jne 8b
>-
>- jmp 99b
>-
>- /* Initial locking failed. */
>-1:
>-# if cond_lock != 0
>- addq $cond_lock, %rdi
>-# endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>- jmp 2b
>-
>- /* Unlock in loop requires wakeup. */
>-3:
>-# if cond_lock != 0
>- addq $cond_lock, %rdi
>-# endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>- jmp 4b
>-
>- /* Locking in loop failed. */
>-5:
>-# if cond_lock != 0
>- addq $cond_lock, %rdi
>-# endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>-# if cond_lock != 0
>- subq $cond_lock, %rdi
>-# endif
>- jmp 6b
>-#endif
>- .size __pthread_cond_timedwait, .-__pthread_cond_timedwait
>-versioned_symbol (libpthread, __pthread_cond_timedwait,
>pthread_cond_timedwait,
>- GLIBC_2_3_2)
>-
>-
>- .align 16
>- .type __condvar_cleanup2, @function
>-__condvar_cleanup2:
>- /* Stack frame:
>-
>- rsp + 72
>- +--------------------------+
>- rsp + 64 | %r12 |
>- +--------------------------+
>- rsp + 56 | %r13 |
>- +--------------------------+
>- rsp + 48 | %r14 |
>- +--------------------------+
>- rsp + 24 | unused |
>- +--------------------------+
>- rsp + 16 | mutex pointer |
>- +--------------------------+
>- rsp + 8 | condvar pointer |
>- +--------------------------+
>- rsp + 4 | old broadcast_seq value |
>- +--------------------------+
>- rsp + 0 | old cancellation mode |
>- +--------------------------+
>- */
>-
>- movq %rax, 24(%rsp)
>-
>- /* Get internal lock. */
>- movq 8(%rsp), %rdi
>- movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-#if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-#else
>- cmpxchgl %esi, cond_lock(%rdi)
>-#endif
>- jz 1f
>-
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>-#if cond_lock != 0
>- subq $cond_lock, %rdi
>-#endif
>-
>-1: movl broadcast_seq(%rdi), %edx
>- cmpl 4(%rsp), %edx
>- jne 3f
>-
>- /* We increment the wakeup_seq counter only if it is lower than
>- total_seq. If this is not the case the thread was woken and
>- then canceled. In this case we ignore the signal. */
>- movq total_seq(%rdi), %rax
>- cmpq wakeup_seq(%rdi), %rax
>- jbe 6f
>- incq wakeup_seq(%rdi)
>- incl cond_futex(%rdi)
>-6: incq woken_seq(%rdi)
>-
>-3: subl $(1 << nwaiters_shift), cond_nwaiters(%rdi)
>-
>- /* Wake up a thread which wants to destroy the condvar object. */
>- xorq %r12, %r12
>- cmpq $0xffffffffffffffff, total_seq(%rdi)
>- jne 4f
>- movl cond_nwaiters(%rdi), %eax
>- andl $~((1 << nwaiters_shift) - 1), %eax
>- jne 4f
>-
>- LP_OP(cmp) $-1, dep_mutex(%rdi)
>- leaq cond_nwaiters(%rdi), %rdi
>- movl $1, %edx
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $FUTEX_WAKE, %eax
>- movl $(FUTEX_WAKE|FUTEX_PRIVATE_FLAG), %esi
>- cmove %eax, %esi
>-#else
>- movl $0, %eax
>- movl %fs:PRIVATE_FUTEX, %esi
>- cmove %eax, %esi
>- orl $FUTEX_WAKE, %esi
>-#endif
>- movl $SYS_futex, %eax
>- syscall
>- subq $cond_nwaiters, %rdi
>- movl $1, %r12d
>-
>-4: LOCK
>-#if cond_lock == 0
>- decl (%rdi)
>-#else
>- decl cond_lock(%rdi)
>-#endif
>- je 2f
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>-
>- /* Wake up all waiters to make sure no signal gets lost. */
>-2: testq %r12, %r12
>- jnz 5f
>- addq $cond_futex, %rdi
>- LP_OP(cmp) $-1, dep_mutex-cond_futex(%rdi)
>- movl $0x7fffffff, %edx
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $FUTEX_WAKE, %eax
>- movl $(FUTEX_WAKE|FUTEX_PRIVATE_FLAG), %esi
>- cmove %eax, %esi
>-#else
>- movl $0, %eax
>- movl %fs:PRIVATE_FUTEX, %esi
>- cmove %eax, %esi
>- orl $FUTEX_WAKE, %esi
>-#endif
>- movl $SYS_futex, %eax
>- syscall
>-
>- /* Lock the mutex only if we don't own it already. This only happens
>- in case of PI mutexes, if we got cancelled after a successful
>- return of the futex syscall and before disabling async
>- cancellation. */
>-5: movq 16(%rsp), %rdi
>- movl MUTEX_KIND(%rdi), %eax
>- andl $(ROBUST_BIT|PI_BIT), %eax
>- cmpl $PI_BIT, %eax
>- jne 7f
>-
>- movl (%rdi), %eax
>- andl $TID_MASK, %eax
>- cmpl %eax, %fs:TID
>- jne 7f
>- /* We managed to get the lock. Fix it up before returning. */
>- callq __pthread_mutex_cond_lock_adjust
>- jmp 8f
>-
>-7: callq __pthread_mutex_cond_lock
>-
>-8: movq 24(%rsp), %rdi
>- movq FRAME_SIZE(%rsp), %r15
>- movq FRAME_SIZE+8(%rsp), %r14
>- movq FRAME_SIZE+16(%rsp), %r13
>- movq FRAME_SIZE+24(%rsp), %r12
>-.LcallUR:
>- call _Unwind_Resume@PLT
>- hlt
>-.LENDCODE:
>- cfi_endproc
>- .size __condvar_cleanup2, .-__condvar_cleanup2
>-
>-
>- .section .gcc_except_table,"a",@progbits
>-.LexceptSTART:
>- .byte DW_EH_PE_omit # @LPStart format
>- .byte DW_EH_PE_omit # @TType format
>- .byte DW_EH_PE_uleb128 # call-site format
>- .uleb128 .Lcstend-.Lcstbegin
>-.Lcstbegin:
>- .uleb128 .LcleanupSTART1-.LSTARTCODE
>- .uleb128 .LcleanupEND1-.LcleanupSTART1
>- .uleb128 __condvar_cleanup2-.LSTARTCODE
>- .uleb128 0
>-#ifndef __ASSUME_FUTEX_CLOCK_REALTIME
>- .uleb128 .LcleanupSTART2-.LSTARTCODE
>- .uleb128 .LcleanupEND2-.LcleanupSTART2
>- .uleb128 __condvar_cleanup2-.LSTARTCODE
>- .uleb128 0
>-#endif
>- .uleb128 .LcallUR-.LSTARTCODE
>- .uleb128 .LENDCODE-.LcallUR
>- .uleb128 0
>- .uleb128 0
>-.Lcstend:
>-
>-
>-#ifdef SHARED
>- .hidden DW.ref.__gcc_personality_v0
>- .weak DW.ref.__gcc_personality_v0
>- .section .gnu.linkonce.d.DW.ref.__gcc_personality_v0,"aw",@progbits
>- .align LP_SIZE
>- .type DW.ref.__gcc_personality_v0, @object
>- .size DW.ref.__gcc_personality_v0, LP_SIZE
>-DW.ref.__gcc_personality_v0:
>- ASM_ADDR __gcc_personality_v0
>-#endif
>diff --git a/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
>b/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
>deleted file mode 100644
>index 0e61d0a..0000000
>--- a/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
>+++ /dev/null
>@@ -1,555 +0,0 @@
>-/* Copyright (C) 2002-2014 Free Software Foundation, Inc.
>- This file is part of the GNU C Library.
>- Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
>-
>- The GNU C Library is free software; you can redistribute it and/or
>- modify it under the terms of the GNU Lesser General Public
>- License as published by the Free Software Foundation; either
>- version 2.1 of the License, or (at your option) any later version.
>-
>- The GNU C Library is distributed in the hope that it will be useful,
>- but WITHOUT ANY WARRANTY; without even the implied warranty of
>- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>- Lesser General Public License for more details.
>-
>- You should have received a copy of the GNU Lesser General Public
>- License along with the GNU C Library; if not, see
>- <http://www.gnu.org/licenses/>. */
>-
>-#include <sysdep.h>
>-#include <shlib-compat.h>
>-#include <lowlevellock.h>
>-#include <lowlevelcond.h>
>-#include <tcb-offsets.h>
>-#include <pthread-pi-defines.h>
>-#include <pthread-errnos.h>
>-#include <stap-probe.h>
>-
>-#include <kernel-features.h>
>-
>-
>- .text
>-
>-/* int pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex)
>*/
>- .globl __pthread_cond_wait
>- .type __pthread_cond_wait, @function
>- .align 16
>-__pthread_cond_wait:
>-.LSTARTCODE:
>- cfi_startproc
>-#ifdef SHARED
>- cfi_personality(DW_EH_PE_pcrel | DW_EH_PE_sdata4 | DW_EH_PE_indirect,
>- DW.ref.__gcc_personality_v0)
>- cfi_lsda(DW_EH_PE_pcrel | DW_EH_PE_sdata4, .LexceptSTART)
>-#else
>- cfi_personality(DW_EH_PE_udata4, __gcc_personality_v0)
>- cfi_lsda(DW_EH_PE_udata4, .LexceptSTART)
>-#endif
>-
>-#define FRAME_SIZE (32+8)
>- leaq -FRAME_SIZE(%rsp), %rsp
>- cfi_adjust_cfa_offset(FRAME_SIZE)
>-
>- /* Stack frame:
>-
>- rsp + 32
>- +--------------------------+
>- rsp + 24 | old wake_seq value |
>- +--------------------------+
>- rsp + 16 | mutex pointer |
>- +--------------------------+
>- rsp + 8 | condvar pointer |
>- +--------------------------+
>- rsp + 4 | old broadcast_seq value |
>- +--------------------------+
>- rsp + 0 | old cancellation mode |
>- +--------------------------+
>- */
>-
>- LIBC_PROBE (cond_wait, 2, %rdi, %rsi)
>-
>- LP_OP(cmp) $-1, dep_mutex(%rdi)
>-
>- /* Prepare structure passed to cancellation handler. */
>- movq %rdi, 8(%rsp)
>- movq %rsi, 16(%rsp)
>-
>- je 15f
>- mov %RSI_LP, dep_mutex(%rdi)
>-
>- /* Get internal lock. */
>-15: movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-#if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-#else
>- cmpxchgl %esi, cond_lock(%rdi)
>-#endif
>- jne 1f
>-
>- /* Unlock the mutex. */
>-2: movq 16(%rsp), %rdi
>- xorl %esi, %esi
>- callq __pthread_mutex_unlock_usercnt
>-
>- testl %eax, %eax
>- jne 12f
>-
>- movq 8(%rsp), %rdi
>- incq total_seq(%rdi)
>- incl cond_futex(%rdi)
>- addl $(1 << nwaiters_shift), cond_nwaiters(%rdi)
>-
>- /* Get and store current wakeup_seq value. */
>- movq 8(%rsp), %rdi
>- movq wakeup_seq(%rdi), %r9
>- movl broadcast_seq(%rdi), %edx
>- movq %r9, 24(%rsp)
>- movl %edx, 4(%rsp)
>-
>- /* Unlock. */
>-8: movl cond_futex(%rdi), %edx
>- LOCK
>-#if cond_lock == 0
>- decl (%rdi)
>-#else
>- decl cond_lock(%rdi)
>-#endif
>- jne 3f
>-
>-.LcleanupSTART:
>-4: callq __pthread_enable_asynccancel
>- movl %eax, (%rsp)
>-
>- xorq %r10, %r10
>- LP_OP(cmp) $-1, dep_mutex(%rdi)
>- leaq cond_futex(%rdi), %rdi
>- movl $FUTEX_WAIT, %esi
>- je 60f
>-
>- mov dep_mutex-cond_futex(%rdi), %R8_LP
>- /* Requeue to a non-robust PI mutex if the PI bit is set and
>- the robust bit is not set. */
>- movl MUTEX_KIND(%r8), %eax
>- andl $(ROBUST_BIT|PI_BIT), %eax
>- cmpl $PI_BIT, %eax
>- jne 61f
>-
>- movl $(FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG), %esi
>- movl $SYS_futex, %eax
>- syscall
>-
>- cmpl $0, %eax
>- sete %r8b
>-
>-#ifdef __ASSUME_REQUEUE_PI
>- jmp 62f
>-#else
>- je 62f
>-
>- /* When a futex syscall with FUTEX_WAIT_REQUEUE_PI returns
>- successfully, it has already locked the mutex for us and the
>- pi_flag (%r8b) is set to denote that fact. However, if another
>- thread changed the futex value before we entered the wait, the
>- syscall may return an EAGAIN and the mutex is not locked. We go
>- ahead with a success anyway since later we look at the pi_flag to
>- decide if we got the mutex or not. The sequence numbers then make
>- sure that only one of the threads actually wake up. We retry using
>- normal FUTEX_WAIT only if the kernel returned ENOSYS, since normal
>- and PI futexes don't mix.
>-
>- Note that we don't check for EAGAIN specifically; we assume that the
>- only other error the futex function could return is EAGAIN since
>- anything else would mean an error in our function. It is too
>- expensive to do that check for every call (which is quite common in
>- case of a large number of threads), so it has been skipped. */
>- cmpl $-ENOSYS, %eax
>- jne 62f
>-
>-# ifndef __ASSUME_PRIVATE_FUTEX
>- movl $FUTEX_WAIT, %esi
>-# endif
>-#endif
>-
>-61:
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $(FUTEX_WAIT|FUTEX_PRIVATE_FLAG), %esi
>-#else
>- orl %fs:PRIVATE_FUTEX, %esi
>-#endif
>-60: xorb %r8b, %r8b
>- movl $SYS_futex, %eax
>- syscall
>-
>-62: movl (%rsp), %edi
>- callq __pthread_disable_asynccancel
>-.LcleanupEND:
>-
>- /* Lock. */
>- movq 8(%rsp), %rdi
>- movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-#if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-#else
>- cmpxchgl %esi, cond_lock(%rdi)
>-#endif
>- jnz 5f
>-
>-6: movl broadcast_seq(%rdi), %edx
>-
>- movq woken_seq(%rdi), %rax
>-
>- movq wakeup_seq(%rdi), %r9
>-
>- cmpl 4(%rsp), %edx
>- jne 16f
>-
>- cmpq 24(%rsp), %r9
>- jbe 19f
>-
>- cmpq %rax, %r9
>- jna 19f
>-
>- incq woken_seq(%rdi)
>-
>- /* Unlock */
>-16: subl $(1 << nwaiters_shift), cond_nwaiters(%rdi)
>-
>- /* Wake up a thread which wants to destroy the condvar object. */
>- cmpq $0xffffffffffffffff, total_seq(%rdi)
>- jne 17f
>- movl cond_nwaiters(%rdi), %eax
>- andl $~((1 << nwaiters_shift) - 1), %eax
>- jne 17f
>-
>- addq $cond_nwaiters, %rdi
>- LP_OP(cmp) $-1, dep_mutex-cond_nwaiters(%rdi)
>- movl $1, %edx
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $FUTEX_WAKE, %eax
>- movl $(FUTEX_WAKE|FUTEX_PRIVATE_FLAG), %esi
>- cmove %eax, %esi
>-#else
>- movl $0, %eax
>- movl %fs:PRIVATE_FUTEX, %esi
>- cmove %eax, %esi
>- orl $FUTEX_WAKE, %esi
>-#endif
>- movl $SYS_futex, %eax
>- syscall
>- subq $cond_nwaiters, %rdi
>-
>-17: LOCK
>-#if cond_lock == 0
>- decl (%rdi)
>-#else
>- decl cond_lock(%rdi)
>-#endif
>- jne 10f
>-
>- /* If requeue_pi is used the kernel performs the locking of the
>- mutex. */
>-11: movq 16(%rsp), %rdi
>- testb %r8b, %r8b
>- jnz 18f
>-
>- callq __pthread_mutex_cond_lock
>-
>-14: leaq FRAME_SIZE(%rsp), %rsp
>- cfi_adjust_cfa_offset(-FRAME_SIZE)
>-
>- /* We return the result of the mutex_lock operation. */
>- retq
>-
>- cfi_adjust_cfa_offset(FRAME_SIZE)
>-
>-18: callq __pthread_mutex_cond_lock_adjust
>- xorl %eax, %eax
>- jmp 14b
>-
>- /* We need to go back to futex_wait. If we're using requeue_pi, then
>- release the mutex we had acquired and go back. */
>-19: testb %r8b, %r8b
>- jz 8b
>-
>- /* Adjust the mutex values first and then unlock it. The unlock
>- should always succeed or else the kernel did not lock the mutex
>- correctly. */
>- movq 16(%rsp), %rdi
>- callq __pthread_mutex_cond_lock_adjust
>- movq %rdi, %r8
>- xorl %esi, %esi
>- callq __pthread_mutex_unlock_usercnt
>- /* Reload cond_var. */
>- movq 8(%rsp), %rdi
>- jmp 8b
>-
>- /* Initial locking failed. */
>-1:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>- jmp 2b
>-
>- /* Unlock in loop requires wakeup. */
>-3:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- /* The call preserves %rdx. */
>- callq __lll_unlock_wake
>-#if cond_lock != 0
>- subq $cond_lock, %rdi
>-#endif
>- jmp 4b
>-
>- /* Locking in loop failed. */
>-5:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>-#if cond_lock != 0
>- subq $cond_lock, %rdi
>-#endif
>- jmp 6b
>-
>- /* Unlock after loop requires wakeup. */
>-10:
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>- jmp 11b
>-
>- /* The initial unlocking of the mutex failed. */
>-12: movq %rax, %r10
>- movq 8(%rsp), %rdi
>- LOCK
>-#if cond_lock == 0
>- decl (%rdi)
>-#else
>- decl cond_lock(%rdi)
>-#endif
>- je 13f
>-
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_unlock_wake
>-
>-13: movq %r10, %rax
>- jmp 14b
>-
>- .size __pthread_cond_wait, .-__pthread_cond_wait
>-versioned_symbol (libpthread, __pthread_cond_wait, pthread_cond_wait,
>- GLIBC_2_3_2)
>-
>-
>- .align 16
>- .type __condvar_cleanup1, @function
>- .globl __condvar_cleanup1
>- .hidden __condvar_cleanup1
>-__condvar_cleanup1:
>- /* Stack frame:
>-
>- rsp + 32
>- +--------------------------+
>- rsp + 24 | unused |
>- +--------------------------+
>- rsp + 16 | mutex pointer |
>- +--------------------------+
>- rsp + 8 | condvar pointer |
>- +--------------------------+
>- rsp + 4 | old broadcast_seq value |
>- +--------------------------+
>- rsp + 0 | old cancellation mode |
>- +--------------------------+
>- */
>-
>- movq %rax, 24(%rsp)
>-
>- /* Get internal lock. */
>- movq 8(%rsp), %rdi
>- movl $1, %esi
>- xorl %eax, %eax
>- LOCK
>-#if cond_lock == 0
>- cmpxchgl %esi, (%rdi)
>-#else
>- cmpxchgl %esi, cond_lock(%rdi)
>-#endif
>- jz 1f
>-
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- callq __lll_lock_wait
>-#if cond_lock != 0
>- subq $cond_lock, %rdi
>-#endif
>-
>-1: movl broadcast_seq(%rdi), %edx
>- cmpl 4(%rsp), %edx
>- jne 3f
>-
>- /* We increment the wakeup_seq counter only if it is lower than
>- total_seq. If this is not the case the thread was woken and
>- then canceled. In this case we ignore the signal. */
>- movq total_seq(%rdi), %rax
>- cmpq wakeup_seq(%rdi), %rax
>- jbe 6f
>- incq wakeup_seq(%rdi)
>- incl cond_futex(%rdi)
>-6: incq woken_seq(%rdi)
>-
>-3: subl $(1 << nwaiters_shift), cond_nwaiters(%rdi)
>-
>- /* Wake up a thread which wants to destroy the condvar object. */
>- xorl %ecx, %ecx
>- cmpq $0xffffffffffffffff, total_seq(%rdi)
>- jne 4f
>- movl cond_nwaiters(%rdi), %eax
>- andl $~((1 << nwaiters_shift) - 1), %eax
>- jne 4f
>-
>- LP_OP(cmp) $-1, dep_mutex(%rdi)
>- leaq cond_nwaiters(%rdi), %rdi
>- movl $1, %edx
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $FUTEX_WAKE, %eax
>- movl $(FUTEX_WAKE|FUTEX_PRIVATE_FLAG), %esi
>- cmove %eax, %esi
>-#else
>- movl $0, %eax
>- movl %fs:PRIVATE_FUTEX, %esi
>- cmove %eax, %esi
>- orl $FUTEX_WAKE, %esi
>-#endif
>- movl $SYS_futex, %eax
>- syscall
>- subq $cond_nwaiters, %rdi
>- movl $1, %ecx
>-
>-4: LOCK
>-#if cond_lock == 0
>- decl (%rdi)
>-#else
>- decl cond_lock(%rdi)
>-#endif
>- je 2f
>-#if cond_lock != 0
>- addq $cond_lock, %rdi
>-#endif
>- LP_OP(cmp) $-1, dep_mutex-cond_lock(%rdi)
>- movl $LLL_PRIVATE, %eax
>- movl $LLL_SHARED, %esi
>- cmovne %eax, %esi
>- /* The call preserves %rcx. */
>- callq __lll_unlock_wake
>-
>- /* Wake up all waiters to make sure no signal gets lost. */
>-2: testl %ecx, %ecx
>- jnz 5f
>- addq $cond_futex, %rdi
>- LP_OP(cmp) $-1, dep_mutex-cond_futex(%rdi)
>- movl $0x7fffffff, %edx
>-#ifdef __ASSUME_PRIVATE_FUTEX
>- movl $FUTEX_WAKE, %eax
>- movl $(FUTEX_WAKE|FUTEX_PRIVATE_FLAG), %esi
>- cmove %eax, %esi
>-#else
>- movl $0, %eax
>- movl %fs:PRIVATE_FUTEX, %esi
>- cmove %eax, %esi
>- orl $FUTEX_WAKE, %esi
>-#endif
>- movl $SYS_futex, %eax
>- syscall
>-
>- /* Lock the mutex only if we don't own it already. This only happens
>- in case of PI mutexes, if we got cancelled after a successful
>- return of the futex syscall and before disabling async
>- cancellation. */
>-5: movq 16(%rsp), %rdi
>- movl MUTEX_KIND(%rdi), %eax
>- andl $(ROBUST_BIT|PI_BIT), %eax
>- cmpl $PI_BIT, %eax
>- jne 7f
>-
>- movl (%rdi), %eax
>- andl $TID_MASK, %eax
>- cmpl %eax, %fs:TID
>- jne 7f
>- /* We managed to get the lock. Fix it up before returning. */
>- callq __pthread_mutex_cond_lock_adjust
>- jmp 8f
>-
>-
>-7: callq __pthread_mutex_cond_lock
>-
>-8: movq 24(%rsp), %rdi
>-.LcallUR:
>- call _Unwind_Resume@PLT
>- hlt
>-.LENDCODE:
>- cfi_endproc
>- .size __condvar_cleanup1, .-__condvar_cleanup1
>-
>-
>- .section .gcc_except_table,"a",@progbits
>-.LexceptSTART:
>- .byte DW_EH_PE_omit # @LPStart format
>- .byte DW_EH_PE_omit # @TType format
>- .byte DW_EH_PE_uleb128 # call-site format
>- .uleb128 .Lcstend-.Lcstbegin
>-.Lcstbegin:
>- .uleb128 .LcleanupSTART-.LSTARTCODE
>- .uleb128 .LcleanupEND-.LcleanupSTART
>- .uleb128 __condvar_cleanup1-.LSTARTCODE
>- .uleb128 0
>- .uleb128 .LcallUR-.LSTARTCODE
>- .uleb128 .LENDCODE-.LcallUR
>- .uleb128 0
>- .uleb128 0
>-.Lcstend:
>-
>-
>-#ifdef SHARED
>- .hidden DW.ref.__gcc_personality_v0
>- .weak DW.ref.__gcc_personality_v0
>- .section .gnu.linkonce.d.DW.ref.__gcc_personality_v0,"aw",@progbits
>- .align LP_SIZE
>- .type DW.ref.__gcc_personality_v0, @object
>- .size DW.ref.__gcc_personality_v0, LP_SIZE
>-DW.ref.__gcc_personality_v0:
>- ASM_ADDR __gcc_personality_v0
>-#endif
>--
>1.8.5.2
>
>
--
Darren Hart Open Source Technology Center
darren.hart@intel.com Intel Corporation