Re: zonefile changes and long running processes.


On 05/08/2014 01:31 AM, Paul Eggert wrote:
> Carlos O'Donell wrote:
>> it stands to reason you get either the new locale or the old locale
>> but don't know which.
> 
> It's good to know that glibc enforces this, but it's not clear that
> POSIX requires it.  As far as I can see, if one thread calls
> localtime_r at the same time some other thread is calling tzset, the
> first thread may get the old time zone, or the new one, or some
> indeterminate "in-between" time zone; all that POSIX requires is that
> localtime_r not crash.

I think we are conflating the implementation and the POSIX
requirements.

The POSIX requirements don't say how localtime_r behaves in the
presence of a concurrent tzset call; the loose requirement is only
that there be no data races on non-atomic objects.

Thus it's the implementation-defined choices that impose on the
user, and those choices should be documented as clearly as possible.
A sensible implementation is one like glibc's. We could make glibc's
implementation more MT-friendly by keeping a single pointer that we
atomically swap to point at the new zoneinfo, so localtime_r is not
slowed down by locks.

I would argue that this is the "new way" of doing things, lock-free,
with atomic types, etc. These subsystems predate this way of thinking.
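
As a rough sketch of that approach in C11 (the tzdata structure and
function names here are hypothetical, not glibc internals):

#include <stdatomic.h>

/* Hypothetical parsed-zoneinfo state, not the real glibc structure.  */
struct tzdata
{
  long gmtoff;
  /* ... transition tables, abbreviations, etc. ... */
};

/* Readers and the (rare) tzset-time updater share one atomic pointer.  */
static _Atomic (struct tzdata *) current_tz;

/* tzset-style update: build the new state off to the side, then
   publish it with a single release store.  Readers never block.  */
void
tz_publish (struct tzdata *fresh)
{
  atomic_store_explicit (&current_tz, fresh, memory_order_release);
  /* Freeing the old tzdata safely needs deferred reclamation, since
     readers may still hold a pointer to it; omitted here.  */
}

/* localtime_r-style read: a single acquire load, no lock taken.  */
struct tzdata *
tz_snapshot (void)
{
  return atomic_load_explicit (&current_tz, memory_order_acquire);
}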

> Also, it seems pretty clear that glibc's enforcement of atomicity
> slows down localtime_r: if two threads can't simultaneously run
> localtime_r, one can easily construct scenarios where localtime_r is
> unnecessarily a bottleneck.  It's plausible that some users would
> prefer a faster, non-bottleneck-prone localtime_r to an atomic
> localtime_r, if only because they know their applications will never
> really need the atomicity.

I completely agree. What we lack today is core developers to work on
issues like this. We are only just rebuilding the community, and
realizing we have years and years of deferred maintenance like this
to work through before we can bring our implementation forward.

> P.S. Have you read the "You don't know jack" paper on shared
> variables?  Some pretty scary stuff there; after reading it I
> couldn't help having some doubts about whether glibc correctly
> enforces localtime_r atomicity.

The localtime_r atomicity is guaranteed by a lock, which ensures
correct operation; I have no fear about that. What's more difficult
is migrating this solution to a lock-free localtime_r using an
atomic pointer to the old or new zoneinfo, as in the sketch above.

All of that is made easier if we use C11 atomics.

Please put your fears to rest :-)

Funny you should mention this, and the timing is apt. I have talked
to Hans in the past because hppa has only one atomic operation,
"load and clear word," and as the author of HP's atomic operations
library he was particularly well placed to answer some of my
questions about HP PA-RISC hardware.

Why is the timing apt? Well, on hppa I just noticed that glibc's
atomic_write_barrier() does nothing, so pthread_spin_unlock
effectively does:

atomic_write_barrier (); /* No-op on hppa.  */
*lock = 0;

And that breaks if your compare-and-swap is entirely synthesized by
the kernel. The CPU can reorder the plain store *lock = 0; as it
sees fit, sometimes moving it before the *lock = 1 performed within
the synthesized atomic region; the unlock is then lost and you are
deadlocked.

The only solution for hppa, or for any architecture with synthesized
atomic operations such as older ARM, is to unlock with
`atomic_exchange_rel`, since it guarantees serialization of the lock
and unlock operations, at some cost in performance.
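
Concretely, the unlock sequence above becomes a sketch like this,
using glibc's internal atomic_exchange_rel macro (memory location,
new value):

/* A real atomic exchange cannot drift ahead of the stores inside
   the critical section the way the plain *lock = 0 store can.  */
atomic_exchange_rel (lock, 0);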

e.g.

  process A on CPU 1:
    semaphore lock S
    store to X
    purge X
    sync (force the purge to complete)
    semaphore release S (via a normal store)

  process B on CPU 2:
    semaphore lock S
    store to X
    ...

On hppa, in the absence of the purge/sync, it's possible that
process B's write to X becomes visible before process A's write to
X, and that's incorrect since it creates a data race and destroys
sequential consistency. Why? Because the locks only provide mutual
exclusion; they don't constrain the ordering or completion of the
memory writes from the cache.

To solve this problem we just use a heavy hammer: we serialize and
sync on both acquire and release of the semaphores. A better
solution is to get help from a compiler that understands the
semantics of the underlying hardware, as in the sketch below.
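
For instance, a minimal C11 spinlock sketch in which the compiler is
the one that knows which barriers, purges, or syncs the target needs
(the type and function names are illustrative, not glibc code):

#include <stdatomic.h>

typedef struct { atomic_int held; } spinlock_t;

static void
spin_lock (spinlock_t *l)
{
  /* Acquire semantics: accesses in the critical section cannot be
     hoisted above the point where we take the lock.  */
  while (atomic_exchange_explicit (&l->held, 1, memory_order_acquire))
    ; /* Spin until we observe the lock free.  */
}

static void
spin_unlock (spinlock_t *l)
{
  /* Release semantics: all earlier stores (e.g. the store to X)
     become visible before the lock is seen as free; the compiler
     emits whatever purge/sync the target requires.  */
  atomic_store_explicit (&l->held, 0, memory_order_release);
}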

> Boehm H-J, Adve SV. You don't know jack about shared variables or
> memory models. ACM Queue 2011 Dec;9(12):40.
> doi:10.1145/2076796.2088916. 
> http://queue.acm.org/detail.cfm?id=2088916

In a similar vein:
"Threads Cannot be Implemented as a Library"
Hans-J. Boehm.
http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf

Hans has been talking about this stuff for years.

Cheers,
Carlos.

