This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 2/2] Remove x86 assembler rwlock code


On Mon, Mar 24, 2014 at 09:43:50AM +0100, Ondřej Bílka wrote:
> On Mon, Mar 17, 2014 at 05:01:31PM -0700, Andi Kleen wrote:
> > From: Andi Kleen <ak@odo.jf.intel.com>
> > 
> > With the recent tuning, the C version of rwlocks has basically the same
> > performance as the x86 assembler version for uncontended locks (within
> > a few cycles, near the run-to-run variability). For other cases it
> > should not matter anyway.
> > 
> > So remove the assembler code and use the C version like other
> > architectures.
> > 
> What benchmark did you use? I would be OK with this once I see data.

A simple benchmark that measured the uncontended performance.
Contended performance is not typically dominated by the actual lock
execution time.

In the results below, rdlock is essentially identical, but wrlock is
~6-9 cycles slower. I originally spent quite some time hunting those
9 cycles, but then I realized that when I run the benchmark many times,
the run-to-run variability is larger than that. So I don't think it's
relevant.

With patch:
./obj/testrun.sh ./rwlockbench/micro
rdlock avg 104
wrlock avg 106
rdlock avg 105
wrlock avg 105
rdlock avg 104
wrlock avg 105
rdlock avg 104
...

Without:
./obj-ref/testrun.sh ./rwlockbench/micro
rdlock avg 104
wrlock avg 98
rdlock avg 104
wrlock avg 97
rdlock avg 104
wrlock avg 97
rdlock avg 104
wrlock avg 97

-Andi

