This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 09/14] Add a new pthread_mutexattr_setelision_np interface.


On Wed, Jul 03, 2013 at 10:44:45AM +0200, Torvald Riegel wrote:
> On Wed, 2013-07-03 at 09:10 +0200, Dominik Vogt wrote:
> > The lock elision
> > concept is too far away from the hardware instructions, and too
> > elusive as a concept from the point of view of the application
> > programmer to be a good area for optimization.  The programmer
> > will probably want to use a transaction interface in the language
> > rather than relying on library code.
> 
> I don't agree.  All the HTMs I'm aware of do not give any
> forward-progress guarantee (ie, that transactions without conflicts will
> eventually commit and not abort due to any obscure reason).

The zEC12 implements "constrained" transactions (TBEGINC
instruction) that are guaranteed to complete eventually.  But the
code that can be used inside a constrained transaction is so
severely limited in size, memory access, and valid instructions
that it is probably only usable in hand-written assembler code.

> Thus, if a programmer is using
> transactions, he/she needs a fallback execution path that does not need
> to rely on the HTM.
> Having a lock-based implementation as fallback is often a pretty
> practical idea: people likely have some experience with locks, they
> aren't a new concept, and their behavior in terms of performance is not
> too hard to understand.  So, if the programmer decides to use lock-based
> code as fallback, he/she will have to build something that's similar to
> lock elision.
> Other options are (1) custom concurrent code, which is often too
> difficult to implement, especially for non-experts, or (2) transactions
> implemented in software without use of the HTM (ie, STM), or (3) perhaps
> other abstractions such as combinations of atomic snapshots and updates,
> RCU, or whatever.
> If you implement programming-language-level transactions (e.g.,
> transaction statements in C/C++ as supported by GCC), you need a
> fallback too.  You can use STMs, but they come with some overhead and if
> you want to run STM transactions concurrently with HTM transactions,
> you're often slowing down the HTM code paths too because you need to do
> additional stuff to synchronize with the STM.  The other option is to
> use a single global lock as fallback, but that can obviously limit
> scalability quickly.  If you want more fine-grained locking, you could
> either ask the programmer to supply a locking scheme (but this kills
> lots of the ease-of-use of transactions), or try to let the compiler
> infer a fine-grained locking scheme (there is research on this, but it's
> hard because it needs a good points-to analysis, and works best with
> whole-program analysis).
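
To make the "lock-based fallback" pattern from the quoted text
concrete, here is a minimal sketch in C.  It is not glibc's
elision code: htm_begin() and htm_end() are hypothetical hooks
standing in for whatever the hardware provides, and they are
stubbed to always abort so that the example runs on any machine.
The names increment_elided(), counter_value() and MAX_RETRIES are
likewise made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical HTM hooks (assumption, not a real API).  These stubs
   always report an abort, so the lock-based fallback is always taken.  */
static bool htm_begin(void) { return false; }
static void htm_end(void) {}

static pthread_mutex_t fallback_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

enum { MAX_RETRIES = 3 };

/* Try the update as a transaction a few times, then fall back to the
   lock.  Returns true if the update committed transactionally.
   A real implementation must also check inside the transaction that
   the fallback lock is free, so that both paths exclude each other.  */
bool increment_elided(void)
{
  for (int i = 0; i < MAX_RETRIES; i++)
    {
      if (htm_begin())
        {
          counter++;
          htm_end();
          return true;
        }
    }

  int rc = pthread_mutex_lock(&fallback_lock);
  assert(rc == 0);
  counter++;
  rc = pthread_mutex_unlock(&fallback_lock);
  assert(rc == 0);
  (void) rc;
  return false;
}

/* Read the counter under the fallback lock.  */
long counter_value(void)
{
  pthread_mutex_lock(&fallback_lock);
  long v = counter;
  pthread_mutex_unlock(&fallback_lock);
  return v;
}
```

With the stubs above every call takes the lock path; the point is
only that the caller sees the same result on either path.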

All these are valid points.  I'd just like to point out some more
things that may well have an impact on which programs can benefit
from HTM in the future and how.

 * There are many potential reasons why transactions can fail,
   even if using them seemed to be a good thing at first.  For
   example, existing implementations of data structures may
   contain statistics like element counts; programs may calculate
   global statistics inside locks; programs may not use locks
   often, or only for short times; contention between transactions
   may be too low or too high; it may not be possible to
   parallelize algorithms efficiently, etc.  These problems may
   not be insurmountable, but existing software has to be analyzed
   and modified.

 * Even if an attempt to use a transaction is worthwhile in a
   specific case, it is very difficult to decide that
   automatically.

 * Today's software was written and optimized without HTM in mind
   because HTM did not exist.  Thus, many applications are
   optimized in a way that reduces the benefit of transactions
   that replace locking.  For example, without transactions the
   following code may be efficient:

     lock();
       x++;
     unlock();
     foo();
     lock();
       y++;
     unlock();
     bar();
     lock();
       z++;
     unlock();

   (foo() and bar() do not contend with other threads but take a
   long time)

   But with transactions it may be more efficient to write

     begin transaction
       x++;
       foo();
       y++;
       bar();
       z++;
     end transaction

   because that pays the penalty for starting a transaction only
   once.

 * As a result, a lot of existing software will not benefit from
   HTM at all, and some software can benefit only if carefully
   rewritten with HTM in mind.  It is by no means guaranteed that
   there is any relevant software that would benefit from
   automatically used library code at all.  There is a chance that
   nobody gets anything for free.

 * Libraries need to use abstract interfaces to HTM.  But the HTM
   implementations and interfaces in hardware (z, Intel, POWER,
   BG/Q, etc.) are very different, and an approach that works well
   for one CPU may not be good for a different one.  For example,
   the libitm interfaces seem to be very awkward for z (according
   to a colleague working on supporting z in libitm).
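
The x/y/z example above can be written out as a small C sketch.
It uses one plain pthread mutex in both variants, so the single
region in update_coarse() only stands in for "begin transaction /
end transaction"; no HTM is involved.  The helper names
enter_region(), leave_region() and region_entries are made up for
illustration, and foo() and bar() are empty placeholders for the
long-running, non-contending work.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
int x, y, z;
int region_entries;   /* how often a protected region was entered */

/* Placeholders for long-running work that does not contend.  */
static void foo(void) {}
static void bar(void) {}

static void enter_region(void)
{
  int rc = pthread_mutex_lock(&big_lock);
  assert(rc == 0);
  (void) rc;
  region_entries++;   /* counted while the lock is held */
}

static void leave_region(void)
{
  int rc = pthread_mutex_unlock(&big_lock);
  assert(rc == 0);
  (void) rc;
}

/* Lock-optimized variant: three short regions, entry cost paid
   three times, foo() and bar() run outside any region.  */
void update_fine_grained(void)
{
  enter_region(); x++; leave_region();
  foo();
  enter_region(); y++; leave_region();
  bar();
  enter_region(); z++; leave_region();
}

/* Transaction-shaped variant: one region around everything, entry
   cost paid once, foo() and bar() run inside the region.  */
void update_coarse(void)
{
  enter_region();
  x++;
  foo();
  y++;
  bar();
  z++;
  leave_region();
}
```

Both functions leave x, y and z in the same state; they differ
only in how often the entry cost is paid (three times versus
once) and in whether foo() and bar() run inside the protected
region.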

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

