This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/6] Fall back to locking if elision runs out of retries.


On Mon, Sep 16, 2013 at 07:24:23PM +0200, Torvald Riegel wrote:
> On Mon, 2013-09-16 at 14:39 +0200, Dominik Vogt wrote:
> > On Wed, Sep 11, 2013 at 05:42:47PM +0200, Torvald Riegel wrote:
> > > On Mon, 2013-09-02 at 09:59 +0200, Dominik Vogt wrote:
> > > > With the old code, elision would never be disabled if every
> > > > transaction aborts
> > > > with a temporary failure.  This patch introduces a new configuration
> > > > value that
> > > > is used whenever pthread_mutex_lock runs out of retries.
> > > 
> > > I agree that this case can exist, but it's not quite as bad as you make
> > > it sound like.  In particular, if the temporary / retryable aborts are
> > > due to conflict with other threads acquiring the same lock, then in
> > > practice we will disable, but via skip_lock_busy.
> > 
> > I do not think this is the case in general.  The code path that
> > handles _ABORT_LOCK_BUSY is only executed in case of a race when
> > one thread looks at *adapt_count and decides to try elision, but
> > before it can start a transaction another thread using elision on
> > the same lock is aborted and switches to using a real lock.  This
> > needs to happen between
> > 
> >   if *adapt_count <= 0
> > 
> > and
> > 
> >   if (*futex == 0)
> > 
> > Otherwise the transaction will simply abort because of a
> > read/write conflict on *futex, but it won't cause an
> > _ABORT_LOCK_BUSY.
> 
> If no thread acquires the locks without elision, then there won't be any
> conflicts on *futex.

Yes, that's exactly what I described above.  The other thread
needs to switch from elision to using the lock while the current
thread is busy between the two lines quoted above.

>  There might be other conflicts on the same cache
> line of course, but these are similar to ...
> 
> > > However, if we have
> > > retryable aborts due to other threads' actions that don't acquire the
> > > same lock, then you are right that we'll never disable.
> 
> ... these aborts.
> 
> Note that if we're worried about livelocks,

That's not my worry here.

> then it might be better to
> try some kind of back-off instead of incrementing adapt_count because
> the latter would probably lead to something similar to convoying: One
> thread takes the slow path and acquires the lock without elision, which
> leads subsequent threads observing the lock as busy, and forcing the
> slow path as well.

There is certainly a danger that locks with high contention never
get out of non-elision mode because once threads start using locks
there may always be a thread that uses a lock and forces all other
threads to wait on the lock too.  In this situation, the elision
code cannot speed up the program, but it won't slow down the
program significantly either.

On the other hand, in a program with a highly contended mutex,
all aborts may be retryable.  The current code does not adapt to
that situation, and the software may run significantly slower
than necessary because it runs frequently into into the configured
number of retries which all abort because of high contention in
the data touched by the transaction body or the *futex.

> The fact that the current adaptation algorithm does not change
> adapt_count on conflicts could thus very well be actually better than
> forcing the slow path as in what you proposed.

> Can we get some
> measurement data on this before considering this change?

What do you have in mind?  It's fairly easy to write a test case
that shows abysmal performance with the existing code, or a test
case that performs nicely with the current code but does not get
a speed boost with the proposed patch.

> Sure.  The explicit aborts might be different, altough that's still
> under discussion I think.  But you haven't shown how the change you
> suggest here is made necessary by any HTM differences.  Thus, I'd like
> to see more discussion of it, and indication that it's the right
> approach.

Maybe the change makes more sense to you if you consider that
aborts on zEC12 are very expensive?  I foresee that three useless
attempts are a big performance problem on the zEC12; the current
Intel logic is out of the question for z.  And even if aborted
transactions on Haswell perform better on Haswell, I see that
problem there too.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]