[PATCH] Fix fastlocks on SMP

H. J. Lu hjl@valinux.com
Sun Feb 18 09:29:00 GMT 2001


On Sat, Feb 17, 2001 at 11:06:32PM +0100, Jakub Jelinek wrote:
> Hi!
> 
> The following patch seems to cure ex5, ex9 and ex10 on ia64/SMP. Basically, if
> lock->__status had lowest bit set on spin_count 0, it would always spin
> until max_count, since lock->__status was cached in a register and never
> reloaded. __compare_and_swap clobbers it, but the codepath with
> (__status & 1) == 1 skips that, so nothing stops gcc from reloading the
> register caching lock->__status only inside of the conditionally
> executed code.
> The patch is attached in two variants, both seem to fix ex5, ex9 and ex10
> (the tests which were previously failing on smp ia64), but the first results
> in better code while the second one is perhaps more readable.
> The assembly difference is in fact only:
> ld8 r15=[r32];;
> in the first patch, changed to
> ld8.acq r15=[r32];;
> in the second. I don't have the ia64 manuals here at home so I cannot check, but
> ld8.acq smells like it would do cache-line ping-pong, which is exactly what the
> code is trying to avoid (by only doing the CAS if a normal load says it could
> succeed).
> 

This may be related to the change we made for __compare_and_swap and
__compare_and_swap_with_release_semantics. The instruction may need
acquire semantics. We may need to examine all places around
__compare_and_swap.
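
For reference, here is a rough sketch of the loop shape under discussion: read
the status word with an ordinary load on every iteration, and only attempt the
CAS when bit 0 is clear. This is not the code from either patch. It uses C11
<stdatomic.h> instead of __compare_and_swap, and struct sketch_fastlock,
sketch_spin_acquire and sketch_release are hypothetical names made up for the
illustration. Which load instruction a given compiler emits for the relaxed
load on ia64 is not something the sketch claims.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fastlock; only the status word matters here. */
struct sketch_fastlock {
    atomic_long status;            /* bit 0 set means "locked" */
};

/* Try to take the lock, giving up after max_count iterations.
   Returns true if the lock was acquired. */
static bool sketch_spin_acquire(struct sketch_fastlock *lock, int max_count)
{
    for (int spin = 0; spin < max_count; spin++) {
        /* Fresh read on every iteration. Because this is an atomic load,
           the compiler may not keep the old value cached in a register
           across iterations. Relaxed ordering asks for no acquire
           semantics on the load itself. */
        long old = atomic_load_explicit(&lock->status, memory_order_relaxed);

        if (old & 1)
            continue;              /* still held: skip the CAS and re-read */

        /* The lock looked free, so the CAS has a chance of succeeding.
           Acquire ordering applies only on success. */
        if (atomic_compare_exchange_strong_explicit(
                &lock->status, &old, old | 1,
                memory_order_acquire, memory_order_relaxed))
            return true;
    }
    return false;
}

/* Clear bit 0 with release ordering. */
static void sketch_release(struct sketch_fastlock *lock)
{
    long old = atomic_fetch_and_explicit(&lock->status, ~1L,
                                         memory_order_release);
    assert(old & 1);               /* caller must have held the lock */
    (void)old;                     /* silence unused warning under NDEBUG */
}
```

The point of the "continue" branch is the one Jakub describes: the path that
skips the CAS must still come back to a real load of the status word,
otherwise the loop spins on a stale value until max_count.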


H.J.


