This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 09/14] Add a new pthread_mutexattr_setelision_np interface.
- From: Dominik Vogt <vogt at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org
- Date: Wed, 3 Jul 2013 13:04:19 +0200
- Subject: Re: [PATCH 09/14] Add a new pthread_mutexattr_setelision_np interface.
- References: <1372452807-25216-1-git-send-email-andi at firstfloor dot org> <1372452807-25216-10-git-send-email-andi at firstfloor dot org> <51D09CA6 dot 7060700 at redhat dot com> <20130630215353 dot GL6123 at two dot firstfloor dot org> <51D20715 dot 7080704 at redhat dot com> <20130702005420 dot GU6123 at two dot firstfloor dot org> <51D302C9 dot 6030002 at redhat dot com> <20130703071033 dot GB5195 at linux dot vnet dot ibm dot com> <1372841085 dot 22198 dot 5193 dot camel at triegel dot csb>
- Reply-to: Libc-Alpha at sourceware dot org
On Wed, Jul 03, 2013 at 10:44:45AM +0200, Torvald Riegel wrote:
> On Wed, 2013-07-03 at 09:10 +0200, Dominik Vogt wrote:
> > The lock elision
> > concept is too far away from the hardware instructions, and too
> > elusive as a concept from the point of view of the application
> > programmer to be a good area for optimization. The programmer
> > will probably want to use a transaction interface in the language
> > rather than relying on library code.
>
> I don't agree. All the HTMs I'm aware of do not give any
> forward-progress guarantee (ie, that transactions without conflicts will
> eventually commit and not abort due to any obscure reason).
The zEC12 implements "constrained" transactions (TBEGINC
instruction) that are guaranteed to complete eventually. But the
code that can be used inside a constrained transaction is so
severely limited in size, memory access, and valid instructions
that it is probably only usable in hand-written assembler code.
> Thus, if a programmer is using
> transactions, he/she needs a fallback execution path that does not need
> to rely on the HTM.
> Having a lock-based implementation as fallback is often a pretty
> practical idea: people likely have some experience with locks, they
> aren't a new concept, and their behavior in terms of performance is not
> too hard to understand. So, if the programmer decides to use lock-based
> code as fallback, he/she will have to build something that's similar to
> lock elision.
> Other options are (1) custom concurrent code, which is often too
> difficult to implement, especially for non-experts, or (2) transactions
> implemented in software without use of the HTM (ie, STM), or (3) perhaps
> other abstractions such as combinations of atomic snapshots and updates,
> RCU, or whatever.
> If you implement programming-language-level transactions (e.g.,
> transaction statements in C/C++ as supported by GCC), you need a
> fallback too. You can use STMs, but they come with some overhead and if
> you want to run STM transactions concurrently with HTM transactions,
> you're often slowing down the HTM code paths too because you need to do
> additional stuff to synchronize with the STM. The other option is to
> use a single global lock as fallback, but that can obviously limit
> scalability quickly. If you want more fine-grained locking, you could
> either ask the programmer to supply a locking scheme (but this kills
> lots of the ease-of-use of transactions), or try to let the compiler
> infer a fine-grained locking scheme (there is research on this, but it's
> hard because it needs a good points-to analysis, and works best with
> whole-program analysis).
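To make the quoted "lock-based code as fallback" idea concrete, here is a
minimal sketch, not the interface proposed in the patch. All names
(tx_begin, elided_lock, elided_unlock, fallback_lock, MAX_TX_ATTEMPTS) are
hypothetical. It assumes Intel RTM intrinsics when the compiler defines
__RTM__ (e.g. with -mrtm); otherwise tx_begin() always reports failure and
only the lock path runs. The retry budget of 3 is an arbitrary assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#if defined(__RTM__)
#include <immintrin.h>   /* _xbegin, _xend, _xabort (Intel RTM) */
#endif

#define MAX_TX_ATTEMPTS 3        /* arbitrary retry budget (assumption) */

static atomic_int fallback_lock; /* 0 = free, 1 = held */

/* Try to start a hardware transaction; returns 1 if now inside one.
   Without RTM support at compile time this always reports failure. */
static int tx_begin(void)
{
#if defined(__RTM__)
    return _xbegin() == _XBEGIN_STARTED;
#else
    return 0;
#endif
}

/* Returns 1 if the critical section runs as a transaction,
   0 if the fallback lock was really acquired. */
static int elided_lock(void)
{
    for (int i = 0; i < MAX_TX_ATTEMPTS; i++) {
        if (tx_begin()) {
            /* Reading the lock word adds it to the read set, so a real
               acquisition by another thread aborts this transaction. */
            if (atomic_load_explicit(&fallback_lock,
                                     memory_order_relaxed) == 0)
                return 1;
#if defined(__RTM__)
            _xabort(0xff);       /* lock is held: give up this attempt */
#endif
        }
    }
    int expected = 0;
    while (!atomic_compare_exchange_weak(&fallback_lock, &expected, 1))
        expected = 0;            /* spin until the lock is free */
    return 0;
}

static void elided_unlock(int in_tx)
{
    if (in_tx) {
#if defined(__RTM__)
        _xend();                 /* commit the transaction */
#endif
    } else {
        assert(atomic_load(&fallback_lock) == 1);
        atomic_store(&fallback_lock, 0);
    }
}

static long counter;

/* Example critical section; the caller does not care which path ran. */
static void increment(void)
{
    int in_tx = elided_lock();
    counter++;
    elided_unlock(in_tx);
}
```

Because there is no forward-progress guarantee, the lock path has to be
correct on its own; the transaction is only an optimization on top of it.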
All these are valid points. I'd just like to point out some more
things that may well have an impact on which programs can benefit
from HTM in the future and how.
* There are many potential reasons why transactions can fail,
even if using them seemed to be a good thing at first. For
example, existing implementations of data structures may
contain statistics like element counts; programs may calculate
global statistics inside locks; programs may not use locks
often, or only for short times; contention between transactions
may be too low or too high; it may not be possible to
parallelize algorithms efficiently, etc. These problems may
not be insurmountable, but existing software has to be analyzed
and modified.
* Even if an attempt to use a transaction is worthwhile in a
specific case, it is very difficult to decide that
automatically.
* Today's software was written and optimized without HTM in mind
because HTM did not exist. Thus, many applications are
optimized in a way that reduces the benefit of transactions
that replace locking. For example, without transactions the
following code may be efficient:
    lock();
    x++;
    unlock();
    foo();
    lock();
    y++;
    unlock();
    bar();
    lock();
    z++;
    unlock();
(foo() and bar() do not contend but take a long time)
But with transactions it may be more efficient to write
    begin transaction
      x++;
      foo();
      y++;
      bar();
      z++;
    end transaction
because this incurs the transaction overhead only once.
* As a result, a lot of existing software will not benefit from
HTM at all, and some software can benefit only if carefully
rewritten with HTM in mind. It is by no means guaranteed that
any relevant software would benefit at all from library code
that uses HTM automatically. There is a chance that nobody gets
anything for free.
* Libraries need to use abstract interfaces to HTM. But the HTM
implementations and interfaces in hardware (z, Intel, POWER,
BG/Q, etc.) are very different, and an approach that works well
for one CPU may not be good for a different one. For example,
the libitm interfaces seem to be very awkward for z (according
to a colleague working on supporting z in libitm).
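To illustrate the x/y/z example from the list above, here is a small
sketch in C. It is only an illustration under assumptions: a pthread
mutex stands in for the region boundary, and region_begin, region_end,
regions_entered and the two update functions are hypothetical names. It
measures nothing about real HTM performance; it only counts how often
the per-region entry cost is paid in each variant.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long x, y, z;
static long regions_entered;  /* how often the entry cost was paid */

static void region_begin(void)
{
    int rc = pthread_mutex_lock(&lock);
    assert(rc == 0);
    (void)rc;
    regions_entered++;
}

static void region_end(void)
{
    int rc = pthread_mutex_unlock(&lock);
    assert(rc == 0);
    (void)rc;
}

/* Placeholders for long-running work that does not contend. */
static void foo(void) { }
static void bar(void) { }

/* Lock-oriented style: three short critical sections. */
static void update_fine_grained(void)
{
    region_begin(); x++; region_end();
    foo();
    region_begin(); y++; region_end();
    bar();
    region_begin(); z++; region_end();
}

/* Transaction-oriented style: one region around everything. */
static void update_single_region(void)
{
    region_begin();
    x++;
    foo();
    y++;
    bar();
    z++;
    region_end();
}
```

Calling update_fine_grained() pays the entry cost three times, while
update_single_region() pays it once. With a real lock the second form
would serialize foo() and bar(); it only makes sense if the region is a
transaction and foo() and bar() do not conflict.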
Ciao
Dominik ^_^ ^_^
--
Dominik Vogt
IBM Germany