This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 12/14] Add manual for lock elision
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Andi Kleen <andi at firstfloor dot org>
- Cc: libc-alpha at sourceware dot org, Andi Kleen <ak at linux dot jf dot intel dot com>
- Date: Sun, 30 Jun 2013 17:19:37 -0400
- Subject: Re: [PATCH 12/14] Add manual for lock elision
- References: <1372452807-25216-1-git-send-email-andi at firstfloor dot org> <1372452807-25216-13-git-send-email-andi at firstfloor dot org>
On 06/28/2013 04:53 PM, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> pthreads are not described in the documentation, but I decided to document
> lock elision there at least.
One suggested wording change to mention `static initializers'.
This manual change is predicated on the new API changes.
> 2013-06-18 Andi Kleen <ak@linux.intel.com>
>
> * manual/Makefile: Add elision.texi.
> * manual/threads.texi: Link to elision.
> * manual/elision.texi: New file.
> * manual/intro.texi: Link to elision.
> * manual/lang.texi: Ditto.
> ---
> manual/Makefile | 2 +-
> manual/elision.texi | 208 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> manual/intro.texi | 3 +
> manual/lang.texi | 2 +-
> manual/threads.texi | 2 +-
> 5 files changed, 214 insertions(+), 3 deletions(-)
> create mode 100644 manual/elision.texi
>
> diff --git a/manual/Makefile b/manual/Makefile
> index 44c0fd4..5d78761 100644
> --- a/manual/Makefile
> +++ b/manual/Makefile
> @@ -42,7 +42,7 @@ chapters = $(addsuffix .texi, \
> message search pattern io stdio llio filesys \
> pipe socket terminal syslog math arith time \
> resource setjmp signal startup process job nss \
> - users sysinfo conf crypt debug threads)
> + users sysinfo conf crypt debug threads elision)
> add-chapters = $(wildcard $(foreach d, $(add-ons), ../$d/$d.texi))
> appendices = lang.texi header.texi install.texi maint.texi platform.texi \
> contrib.texi
> diff --git a/manual/elision.texi b/manual/elision.texi
> new file mode 100644
> index 0000000..ecd45e3
> --- /dev/null
> +++ b/manual/elision.texi
> @@ -0,0 +1,208 @@
> +@node Lock elision, Language Features, POSIX Threads, Top
> +@c %MENU% Lock elision
> +@chapter Lock elision
> +
> +@c create the bizarre situation that lock elision is documented, but pthreads isn't
> +
> +This chapter describes the elided lock implementation for POSIX thread locks.
> +
> +@menu
> +* Lock elision introduction:: What is lock elision?
> +* Semantic differences of elided locks::
> +* Tuning lock elision::
> +* Setting elision for individual @code{pthread_mutex_t}::
> +* Setting elision for individual @code{pthread_rwlock_t}::
> +@end menu
> +
> +@node Lock elision introduction
> +@section Lock elision introduction
> +
> +Lock elision is a technique to improve lock scaling. It runs
> +lock regions in parallel using hardware support for a transactional execution
> +mode. The lock region is executed speculatively, and as long
> +as there is no conflict or other reason for a transaction abort the lock region
> +is executed in parallel. If a transaction abort occurs, any
> +side effect of the speculative execution is undone, the lock is taken
> +for real and the lock region re-executed. This improves scalability
> +of the program because locks do not need to wait for each other.
> +
> +The standard @code{pthread_mutex_t} mutexes and @code{pthread_rwlock_t} rwlocks
> +can be transparently elided by @theglibc{}.
> +
> +Lock elision may lower performance if transaction aborts occur too frequently.
> +In this case it is recommended to use a PMU profiler to find the causes for
> +the aborts first and try to eliminate them. If that is not possible
> +elision can be disabled for a specific lock or for the whole program.
> +Alternatively elision can be disabled completely, and only enabled for
> +specific locks that are known to be elision friendly.
> +
> +The default locks are adaptive. The library decides whether elision
> +is profitable based on the abort rates, and automatically disables
> +elision for a lock when it aborts too often. After some time elision
> +is re-tried, in case the workload changed.
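
A reader may want a picture of what `adaptive' means here. A minimal
sketch of one possible policy follows; it is not the code in this patch
series. After an abort the lock skips elision for a fixed number of
acquisitions and then tries again. The names adaptive_state,
SKIP_AFTER_ABORT, should_try_elision and note_abort are hypothetical, and
the value 3 is arbitrary.

```c
#include <stdbool.h>

/* Hypothetical tuning value: how many acquisitions take the real lock
   after an abort before elision is tried again.  */
#define SKIP_AFTER_ABORT 3

/* Hypothetical per-lock adaptation state.  */
struct adaptive_state
{
  int skip_count;   /* Remaining acquisitions that must not elide.  */
};

/* Decide whether the next acquisition should attempt elision.
   While skip_count is positive, count it down and use the real lock.  */
bool
should_try_elision (struct adaptive_state *s)
{
  if (s->skip_count > 0)
    {
      s->skip_count--;
      return false;
    }
  return true;
}

/* Record a transaction abort: back off from elision for a while.  */
void
note_abort (struct adaptive_state *s)
{
  s->skip_count = SKIP_AFTER_ABORT;
}
```

A real policy would probably distinguish abort causes and use more than
one threshold; the sketch only shows the back-off-and-retry shape the
paragraph describes.
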
> +
> +Lock elision is currently supported for default (timed) mutexes, and
> +rwlocks. Other lock types (including @code{PTHREAD_MUTEX_NORMAL}) do not elide.
> +Condition variables also do not elide. This may change in future versions.
> +
> +@node Semantic differences of elided locks
> +@section Semantic differences of elided locks
> +
> +Elided locks have some semantic (visible) differences to classic locks. These differences
> +are only visible when the lock is successfully elided. Since elision may always
> +fail, a program cannot rely on any of these semantics.
> +
> +@itemize
> +@item
> +Timed locks may not time out.
> +
> +@smallexample
> +pthread_mutex_lock (&lock);
> +if (pthread_mutex_timedlock (&lock, &timeout) == 0)
> + /* When the lock is successfully elided we come here. */
> +else
> + /* Without elision we always come here because the timeout expires. */
> +@end smallexample
> +
> +Similar semantic changes apply to @code{pthread_rwlock_trywrlock} and
> +@code{pthread_rwlock_timedwrlock}.
> +
> +A program like
> +
> +@smallexample
> +/* lock is not a recursive lock type */
> +pthread_mutex_lock (&lock);
> +/* Relock same lock in same thread */
> +pthread_mutex_lock (&lock);
> +@end smallexample
> +
> +will immediately hang on the second lock (deadlock) without elision. With
> +elision the deadlock will only happen on an abort, which can happen
> +early or could happen later, but will likely not happen every time.
> +
> +This behavior is allowed in POSIX for @code{PTHREAD_MUTEX_DEFAULT}, but not for
> +@code{PTHREAD_MUTEX_NORMAL}. When @code{PTHREAD_MUTEX_NORMAL} is
> +set for a mutex using @code{pthread_mutexattr_settype} elision is implicitly
> +disabled. Note that @code{PTHREAD_MUTEX_INITIALIZER} sets a
> +@code{PTHREAD_MUTEX_DEFAULT} type, thus allows elision.
> +
> +Depending on the ABI version @theglibc{} may not distinguish between
> +@code{PTHREAD_MUTEX_NORMAL} and @code{PTHREAD_MUTEX_DEFAULT}, as they may
> +have the same numerical value. If that is the case any call to
> +@code{pthread_mutexattr_settype} with either type will disable elision.
> +
> +@item
> +@code{pthread_mutex_destroy} does not return an error when the lock is locked
> +and will clear the lock state.
> +
> +@item
> +@code{pthread_mutex_t} and @code{pthread_rwlock_t} appear free from other threads.
> +
> +This can be visible through trylock or timedlock.
> +In most cases checking this is an existing latent race in the program, but there may
> +be cases when it is not.
> +
> +@item
> +@code{EAGAIN} and @code{EDEADLK} in rwlocks will not happen under elision.
> +
> +@item
> +@code{pthread_mutex_unlock} does not return an error when unlocking a free lock.
> +
> +@item
> +Elision changes timing because locks now run in parallel.
> +Timing differences may expose latent race bugs in the program. Programs using time based synchronization
> +(as opposed to using data dependencies) may change behavior.
> +
> +@end itemize
> +
> +@node Tuning lock elision
> +@section Tuning lock elision
> +
> +Critical regions may need some tuning to get the benefit of lock elision.
> +This is based on the abort rates, which can be determined by a PMU profiler
> +(e.g. perf on @gnulinuxsystems{}). When the abort rate is too high lock
> +scaling will not improve. Generally lock elision tuning should be done
> +only based on profile feedback.
> +
> +Most of these optimizations will improve performance even without lock elision
> +because they will minimize cache line bouncing between threads or make
> +lock regions smaller.
> +
> +Common causes of transactional aborts:
> +
> +@itemize
> +@item
> +Non-elidable operations such as system calls, I/O, and CPU exceptions.
> +
> +Move these out of the critical section when they are common. Note that they often happen at program startup only.
> +@item
> +Global statistic counts
> +
> +Global statistic variables tend to cause conflicts. Either disable them, make them per thread, or as a last resort
> +sample them (do not update them on every operation).
> +@item
> +False sharing of variables or data structures causing conflicts with other threads
> +
> +Add padding as needed.
> +@item
> +Other conflicts on the same cache lines with other threads
> +
> +Minimize conflicts with other threads. This may require changes to the data structures.
> +@item
> +Capacity overflow
> +
> +The memory transaction used for lock elision has a limited capacity. Make the critical region smaller
> +or move operations that do not need to be protected by the lock outside.
> +
> +@item
> +Rewriting already set flags
> +
> +Setting flags or variables in shared objects that are already set may cause conflicts. Add a check
> +to only write when the value changed.
> +
> +@item
> +Using @code{pthread_mutex_trylock} or @code{pthread_rwlock_trywrlock}
> +nested in another elided lock.
> +
> +@end itemize
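
Two of the items above (per-thread statistics with padding, and not
rewriting flags that are already set) may be easier to follow with a
small illustration. This is only a sketch under my own assumptions: a
64-byte cache line, a fixed MAX_THREADS, and the hypothetical names
padded_counter, count_hit, total_hits and set_flag_if_changed. It uses
C11 _Alignas and _Static_assert.

```c
#include <stdbool.h>

#define CACHE_LINE_SIZE 64   /* Assumption; depends on the CPU.  */
#define MAX_THREADS 8        /* Hypothetical fixed thread count.  */

/* One counter per thread, each aligned to its own cache line so that
   updates from different threads do not conflict (no false sharing).  */
struct padded_counter
{
  _Alignas (CACHE_LINE_SIZE) unsigned long value;
};

_Static_assert (sizeof (struct padded_counter) == CACHE_LINE_SIZE,
                "each counter must fill exactly one cache line");

static struct padded_counter hits[MAX_THREADS];

/* Called by thread number THREAD_INDEX only; touches only its own line.  */
void
count_hit (unsigned int thread_index)
{
  hits[thread_index].value++;
}

/* Sum the per-thread counters.  Intended to be called while the worker
   threads are quiescent, for example at shutdown.  */
unsigned long
total_hits (void)
{
  unsigned long sum = 0;
  for (int i = 0; i < MAX_THREADS; i++)
    sum += hits[i].value;
  return sum;
}

/* Store VALUE into *FLAG only if it differs.  A read inside a
   transaction is cheaper for other threads than a redundant write.
   Returns true if a store was performed.  */
bool
set_flag_if_changed (int *flag, int value)
{
  if (*flag == value)
    return false;
  *flag = value;
  return true;
}
```

As the section says, both changes should help even without elision,
since they reduce cache line bouncing between threads.
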
> +
> +@node Setting elision for individual @code{pthread_mutex_t}
> +@section Setting elision for individual @code{pthread_mutex_t}
> +
> +Elision can be explicitly disabled or enabled for each @code{pthread_mutex_t} in the program.
> +The elision flags can only be set at runtime using @code{pthread_mutexattr_setelision_np} and
> +@code{pthread_mutex_init}.
> There is no support for initializers for them.
Suggest:
There is currently no support for static initializers.
> +
> +@smallexample
> +/* Force lock elision for a mutex */
> +pthread_mutexattr_t attr;
> +pthread_mutexattr_init (&attr);
> +pthread_mutexattr_setelision_np (&attr, 1);
> +pthread_mutex_init (&object->mylock, &attr);
> +@end smallexample
> +
> +@smallexample
> +/* Force no lock elision for a mutex */
> +pthread_mutexattr_t attr;
> +pthread_mutexattr_init (&attr);
> +pthread_mutexattr_setelision_np (&attr, 0);
> +pthread_mutex_init (&object->mylock, &attr);
> +@end smallexample
> +
> +Setting a @code{PTHREAD_MUTEX_NORMAL} lock type will also disable elision.
> +In some versions of the library any call to @code{pthread_mutexattr_settype}
> +may also disable elision for that lock.
> +
> +@node Setting elision for individual @code{pthread_rwlock_t}
> +@section Setting elision for individual @code{pthread_rwlock_t}
> +
> +Elision can be explicitly disabled or enabled for each @code{pthread_rwlock_t} in the program using @code{pthread_rwlockattr_setelision_np}.
> +
> +@smallexample
> +/* Force lock elision for a dynamically allocated rwlock */
> +pthread_rwlockattr_t rwattr;
> +pthread_rwlockattr_init (&rwattr);
> +pthread_rwlockattr_setelision_np (&rwattr, 1);
> +pthread_rwlock_init (&object->myrwlock, &rwattr);
> +@end smallexample
> +
> diff --git a/manual/intro.texi b/manual/intro.texi
> index deaf089..5914035 100644
> --- a/manual/intro.texi
> +++ b/manual/intro.texi
> @@ -703,6 +703,9 @@ information about the hardware and software configuration your program
> is executing under.
>
> @item
> +@ref{Lock elision} describes elided locks in POSIX threads.
> +
OK.
> +@item
> @ref{System Configuration}, tells you how you can get information about
> various operating system limits. Most of these parameters are provided for
> compatibility with POSIX.
> diff --git a/manual/lang.texi b/manual/lang.texi
> index ee04e23..72e06b0 100644
> --- a/manual/lang.texi
> +++ b/manual/lang.texi
> @@ -1,6 +1,6 @@
> @c This node must have no pointers.
> @node Language Features
> -@c @node Language Features, Library Summary, , Top
> +@c @node Language Features, Library Summary, Lock elision, Top
OK.
> @c %MENU% C language features provided by the library
> @appendix C Language Facilities in the Library
>
> diff --git a/manual/threads.texi b/manual/threads.texi
> index a23ac26..f58ea6e 100644
> --- a/manual/threads.texi
> +++ b/manual/threads.texi
> @@ -1,5 +1,5 @@
> @node POSIX Threads
> -@c @node POSIX Threads, , Cryptographic Functions, Top
> +@c @node POSIX Threads, Lock elision, Cryptographic Functions, Top
OK.
> @chapter POSIX Threads
> @c %MENU% POSIX Threads
> @cindex pthreads
>