This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug nptl/4209] Performance issue: NPTL semaphores work slower than linuxthreads semaphores
- From: "bart dot vanassche at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sources dot redhat dot com
- Date: 18 Mar 2007 09:16:49 -0000
- Subject: [Bug nptl/4209] Performance issue: NPTL semaphores work slower than linuxthreads semaphores
- References: <20070317113630.4209.bart.vanassche@gmail.com>
- Reply-to: sourceware-bugzilla at sourceware dot org
------- Additional Comments From bart dot vanassche at gmail dot com 2007-03-18 09:16 -------
(In reply to comment #3)
> And where is a bug? If you have proposals make them. Otherwise go away.
My proposal is to modify sem_post() so that it calls lll_futex_wake() only if
a thread is waiting in sem_wait(). This can be done by adding an atomic counter
to sem_t that holds the number of threads currently waiting on the semaphore.
This makes single-threaded use of sem_post() and sem_wait() four times faster
[NPTL], and can speed up multithreaded use of sem_post(). Of course this
optimization is only possible for semaphores used within a single process, and
not for semaphores shared between processes.
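A minimal sketch of the waiter-counter idea, not the actual patch: the type
csem_t and the csem_* functions are hypothetical names, and a pthread mutex
plus condition variable stand in for the futex so that the example stays
portable. The point it illustrates is that csem_post() takes the expensive
wake-up path only when the counter shows a thread blocked in csem_wait().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical process-private semaphore; this is NOT glibc's sem_t layout. */
typedef struct {
    atomic_int value;      /* number of available tokens */
    atomic_int nwaiters;   /* threads on the slow path of csem_wait() */
    pthread_mutex_t lock;  /* lock + cond stand in for the futex (assumption) */
    pthread_cond_t cond;
} csem_t;

static void csem_init(csem_t *s, int initial)
{
    assert(initial >= 0);
    atomic_init(&s->value, initial);
    atomic_init(&s->nwaiters, 0);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
}

/* Take one token if one is available; never blocks. Returns 1 on success. */
static int csem_trywait(csem_t *s)
{
    int v = atomic_load(&s->value);
    while (v > 0) {
        /* On failure v is reloaded with the current value and rechecked. */
        if (atomic_compare_exchange_weak(&s->value, &v, v - 1))
            return 1;
    }
    return 0;
}

static void csem_wait(csem_t *s)
{
    if (csem_trywait(s))
        return;                              /* fast path: no blocking */

    /* Announce ourselves BEFORE re-checking the value, so a concurrent
       csem_post() either sees the counter or we see its token. */
    atomic_fetch_add(&s->nwaiters, 1);
    pthread_mutex_lock(&s->lock);
    while (!csem_trywait(s))
        pthread_cond_wait(&s->cond, &s->lock);
    pthread_mutex_unlock(&s->lock);
    atomic_fetch_sub(&s->nwaiters, 1);
}

static void csem_post(csem_t *s)
{
    atomic_fetch_add(&s->value, 1);
    /* Only pay for the wake-up when somebody is actually waiting. */
    if (atomic_load(&s->nwaiters) > 0) {
        pthread_mutex_lock(&s->lock);
        pthread_cond_signal(&s->cond);
        pthread_mutex_unlock(&s->lock);
    }
}
```

In the uncontended case csem_post() and csem_wait() each reduce to a single
atomic operation, which is where the saving comes from. The sketch relies on
the default sequentially consistent ordering of the C11 atomics; weaker
orderings would need a more careful argument.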
$ ./perf2
NPTL
mutex elapsed: 386132 us; per iteration: 38 ns.
semaphore elapsed: 2608915 us; per iteration: 260 ns.
custom semaphore elapsed: 163885 us; per iteration: 16 ns.
semaphore ping-pong elapsed: 19507114 us; per iteration: 1950 ns.
custom semaphore ping-pong elapsed: 10037632 us; per iteration: 1003 ns.
$ LD_LIBRARY_PATH=/home/bart/glibc236/lib: /home/bart/glibc236/lib/ld-linux.so.2 ./glibc236-perf2
linuxthreads
mutex elapsed: 537161 us; per iteration: 53 ns.
semaphore elapsed: 1147894 us; per iteration: 114 ns.
custom semaphore elapsed: 156364 us; per iteration: 15 ns.
semaphore ping-pong elapsed: 28256860 us; per iteration: 2825 ns.
custom semaphore ping-pong elapsed: 18066323 us; per iteration: 1806 ns.
Note: because of the nature of this optimization, the execution time of the
"custom semaphore ping-pong" test varies much more than that of the other
tests. It ranges from 1000 ns per iteration (context switch time?) to 2000 ns
per iteration (context switch time + futex system call time?).
--
http://sourceware.org/bugzilla/show_bug.cgi?id=4209