This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro

From: Torvald Riegel <triegel at redhat dot com>
To: Alexandre Oliva <aoliva at redhat dot com>
Cc: libc-alpha at sourceware dot org, carlos at redhat dot com, mtk dot manpages at gmail dot com
Date: Wed, 20 Nov 2013 15:05:25 +0100
Subject: Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
Authentication-results: sourceware.org; auth=none
References: <20131113081059 dot 3464 dot 51385 dot stgit at frit dot home> <20131113081132 dot 3464 dot 30409 dot stgit at frit dot home> <1384859432 dot 32326 dot 364 dot camel at triegel dot csb> <orsiurva0g dot fsf at livre dot home>

On Wed, 2013-11-20 at 04:54 -0200, Alexandre Oliva wrote:
> On Nov 19, 2013, Torvald Riegel <triegel@redhat.com> wrote:
> 
> > On Wed, 2013-11-13 at 06:11 -0200, Alexandre Oliva wrote:
> 
> >> +Intuitive definition of these properties, that attempt to capture
> >> +the meaning of the standard definitions, follow:
> 
> > Spelling: I believe that you should either drop the (first) comma(s), or
> > use "which" instead of "that".
> 
> Hmm, definitions was supposed to be plural, this may have made it more
> confusing than it was.  How about this:
> 
>   Intuitive definitions of these properties, attempting to capture the
>   meaning of the standard definitions, follow:

That sounds good.

> > If a program calls an MT-Unsafe function and thereby creates a data
> > race, then this data race can manifest in undefined behavior even
> > *before* the abstract machine would make the MT-Unsafe call.  This can
> > happen because the compiler can assume that programs are data-race-free,
> > and use certain optimizations that wouldn't be allowed on (properly
> > synchronizing) MT-Safe versions of the called code.  For example, if
> > there is sequential code after the call but depending on its outcome,
> > then the compiler could speculative and move it before the MT-Unsafe
> > call, which could cause a crash before the call actually happens.
> 
> What you write is true, but I don't think it's relevant.  All glibc
> functions that matter are opaque to their callers, therefore compilers
> couldn't possibly optimize assuming there aren't synchronization
> primitives in them.  Do you have any evidence of the opposite, that
> might justify further obfuscating the definition?

What about linking statically and using LTO?

Such a scenario might not be common today, but I'd prefer to just be
clear about what using MT-Unsafe functions in multi-threaded code can
result in.  It doesn't hurt us to be clear, and we don't gain anything
when we promise more than we can guarantee.

> >> +Each of these unprotected uses will use either the previous or the
> >> +subsequent locale information, so they won't cause crashes or access to
> >> +uninitialized, unmapped or recycled memory.  However, since some cases
> >> +use cached locale information while others access the effective locale
> >> +object anew, concurrent changes to the global locale object may cause
> >> +these functions to behave in ways that they could not behave should the
> >> +execution of @code{setlocale} and of the so-annotated functions be
> >> +atomic, or even should @code{setlocale} alone be atomic.
> 
> > Can you break up this sentence?
> 
> There are several sentences in the quoted paragraph.  Can you point out
> which one you mean, ideally suggesting alternate wording?

However, sometimes cached local information is used whereas in other
cases, the locale object is accessed anew; thus, concurrent changes to
the global locale object may cause these functions to behave in
unexpected ways (i.e., behavior that is not possible if @code{setlocale}
and the so-annotated functions would have been executed in some
sequential order).

I don't quite know what you mean by "or even should @code{setlocale}
alone be atomic" or if that's necessary, so I didn't add it to the
sentence above.

> 
> >> +@item @code{envromt}
> >> +@cindex envromt
> >> +
> >> +Functions marked with @code{envromt} access the environment with
> >> +@code{getenv} or similar, requiring the environment to be effectively
> >> +read-only for MT-Safe operation.
> >> +
> >> +Environment-modifying functions do not protect in any way against
> >> +concurrent modifications or access, so calling @code{envromt}-marked
> >> +functions concurrently with @code{setenv}, @code{putenv},
> >> +@code{unsetenv} or direct modifications of the global environment data
> >> +structures is ill-advised; external concurrency control must be
> >> +introduced by callers of these environment-modifying and
> >> +@code{envromt}-marked functions.
> >> +
> >> +Functions that modify the environment are also marked with
> >> +@code{envromt}, but they are not MT-Safe for the reasons above.  Since
> >> +all environment-modifying functions are MT-Unsafe, functions that only
> >> +access the environment are marked as MT-Safe when no other safety issue
> >> +applies.
> 
> > I know you spelled the exception out here, but I'm wondering whether
> > everyone will remember this when seeing MT-Safe somewhere.  Why didn't
> > you use a different MT-Safe tag (eg, MT-Safe-Env) for those functions,
> > but instead merged them with the "true" MT-Safe?
> 
> Same reason as glocale, really: once all methods to modify something are
> decreed unsafe to call in MT programs, this something becomes constant
> in conformant MT programs, and therefore the functions are indeed
> perfectly safe to call.

That's true, but then you're still merging categories, namely functions
"MT-Safe under constraint X", where X could be the environment stuff, or
something else like glocale.  Or did I misunderstand your categories?

Also, from a readers perspective, mentioning "global" constraints like
environment first before mentioning MT-Safe would be safer even if
TL;DR.

> It occurs to me that it might be useful to distinguish between read-only
> and read-write cases, for example, if there's a 1stcall mark: if you
> call the function in ST mode so as to eliminate that safety issue, does
> envromt indicate the function remains unsafe to call, or that it just
> reads from the environment so that it's fine to call after a first
> single-threaded call.
> 
> >> +
> >> +@item @code{oncesafe}
> >> +@cindex oncesafe
> >> +
> >> +Functions marked with @code{oncesafe} use the internal @code{libc_once}
> >> +machinery or similar to initialize internal data structures.
> >> +
> >> +If a signal handler interrupts such an initializer, and calls any
> >> +function that also performs @code{libc_once} initialization, it will
> >> +deadlock if the thread library is linked in.
> 
> > If "oncesafe" marks cases where a deadlock can happen, shouldn't it be
> > called "onceunsafe" or something like this instead?
> 
> Read it aloud, collapsing the final phoneme from once with the first
> from safe ;-)

:)

> I tried other spellings such as wunsafe, but this one
> didn't bring libc_once to mind.  It's to be understood as unsafe because
> libc_once is unsafe; can you suggest an alternate keyword or spelling?

Well, what about "onceunsafe"?  Do you dislike this option?

> >> +@item @code{1stcall}
> >> +@cindex 1stcall
> >> +
> >> +Functions marked with @code{1stcall} perform thread-unsafe
> >> +initialization when they are first called.
> 
> > I agree with Carlos on this (ie, it sounds as if it's better to be
> > listed as an exception to MT-Safe
> 
> But...  That's what it is!  It's MT-unsafe (1stcall).
> 
> > MT-Safe | oncesafe | 1stcall
> 
> oncesafe is about libc_once initialization.  libc_once is MT-Safe; so
> oncesafe is a note for AS-Unsafe and AC-Unsafe.  1stcall is for
> non-libc_once initializations that are not even MT-Safe.

Agree that oncesafe is a bad example in combination with MT-Safe.

> > with the definition above, you need to add that there could be
> > other restrictions additionally).
> 
> This is the case of all notes; there's no reason to single this one out.

I guess I missed "the only reason" (in your docs) when reading this
first.

> >> +Functions marked with @code{uunguard} modify non-atomically arguments or
> >> +global objects that other functions access without synchronization.  To
> >> +ensure MT- and AS-Safe behavior, callers should refrain from calling
> >> +so-marked functions concurrently with readers of those objects.  A

Just readers or also other "writers"?

> >> +consequence of regarding modifiers of these objects as unsafe is that
> >> +the covered objects can be regarded as constant (subject to the
> >> +observation of safety constraints), so that all readers can be
> >> +considered safe in this regard.
> 
> > AFAIU, uunguard are functions that if called, override the safety
> > properties of MT-Safe or AS-Safe functions?
> 
> They're functions that, if called in MT programs, cause undefined
> behavior, like any MT-Unsafe functions.

If that's the case, why do you have uunguard at all, given that it's
just the negation of MT-Safe (or, be equal to MT-Unsafe)?

> uunguard is a catch-all general case for envromt and glocale, discussed
> earlier in this message.
> 
> > Functions marked with @code{uunguard} modify arguments or global objects
> > that other, potentially MT-Safe or AS-Safe, functions access without
> > synchronization.  Thus, calls to so-marked functions potentially make
> > all concurrent calls to otherwise MT-Safe or AS-Safe functions MT-Unsafe
> > and/or AS-Unsafe.
> 
> That's a consequence of calling any MT-Unsafe function.  No reason to
> single this one out, is there?

Then I don't understand what this category is really about.  Is it
functions that are MT-Unsafe with an additional tag revealing why they
are unsafe?  Is it functions that are MT-Safe except when certain other
functions are called, namely those MT-Unsafe ones that do the
modification?  Or something else?

Second, assuming that uunguard is not unlike MT-Unsafe, why is it listed
as a constraint of MT-Safe functions?

> > There should also be a link in the MT-Safe and AS-Safe paragraphs to
> > this paragraph here, because both depend on this exception (ie, it's
> > only MT-Safe if no uunguard is called concurrently).
> 
> This reasoning is backwards.  Those other functions *are* MT-Safe
> *because* MT-Unsafe functions will not be called.  If they are, all bets
> are off in very many different ways.

Your docs contain:

+@cindex MT-Safe
+MT-Safe functions are safe to call in the presence of other threads.
MT
+stands for Multi Thread.

[more definitions]

+When a function is safe to call only under certain constraints, we will
+add keywords to the safety notes whose meanings are defined as follows:

[other items]

+@item @code{uunguard}
+@cindex uunguard

[description of uunguard that I don't seem to understand]


What I'm saying is that the definition of MT-Safe at the very top should
link to the listing of constraints.  It's true that if you call an
MT-Safe function A and a function B that is MT-Safe-under-constraint-X
concurrently, and X doesn't hold, then this could be considered as
MT-Safe A being mixed with MT-Unsafe B.  But readers could also
interpret the first definition as "MT-Safe regardless".

> 
> >> +@item @code{xguargs}
> >> +@cindex xguargs
> >> +
> >> +Functions marked with @code{xguargs} may use or modify objects passed
> >> +(indirectly) as arguments, without any guards to guarantee consistency.
> >> +To ensure MT- and AS-Safe behavior, callers must ensure that the objects
> >> +passed in are not accessed or modified concurrently by other threads or
> >> +signal handlers.
> >> +
> >> +This mark is only applied with regard to an object when the
> >> +@code{uunguard} mark is not applied because of the same object, and the
> >> +object is opaque or not intended to be modified by code outside of
> >> +@theglibc{}.
> 
> > Wouldn't you need to apply the mark only if the object is not opaque and
> > might be considered to be fine to be modified outside of glibc?
> 
> No, the rationale you quoted explains why *this* note covers the opaque
> or presumed-constant objects.
> 
> > When specifying this more formally, I believe we need to be careful to
> > specify the "lifetime" of this constraint (eg, is it just for the
> > duration of this call?).
> 
> The description in the first paragraph permits this reading, yeah.
> E.g., functions that hold pointers to passed-in (sub)objects and later
> on, asynchronously, proceed to modify them, would be marked with a
> broader, bigger red flag ;-)
> 
> >> +@item @code{tempsig}
> >> +@cindex tempsig
> 
> >> +Functions marked with @code{tempsig} may temporarily install signal
> >> +handlers for internal purposes, which may interfere with other uses of
> >> +those signals.  
> 
> >> +This makes such functions MT-Unsafe and AS-Unsafe to begin with.
> 
> > Why does it make such functions MT-Unsafe in every case?
> 
> Err...  Because they may interfere with other uses of those signals, and
> there's not much one can do to avoid this interference.  Calling the
> function concurrent from two different threads is a no-no.  Heck, since
> signals are not thread-aware, it's not even safe to call the function in
> a single thread; the signal it expects may be delivered to a different
> thread, failing to fulfill its purpose.

But that doesn't mean that the signal handler must lead to bad behavior.
I was mainly wondering why installing a temporary signal handler would
always lead to this being MT-Unsafe.  I suppose that there are ways to
not make that MT-Unsafe, at least as far as glibc is concerned.

> > Spelling: cummulative -> cumulative, I believe.
> 
> Thanks.
> 
> > Given that we control the allocator, I believe that it can be made
> > atomic (without disabling signals or reentrancy).
> 
> That's not enough; we'd have to make *storing* the address of the
> allocated memory into its destination an atomic part of the allocation
> atomic unit, otherwise memory would still leak after we returned it, if
> cancellation occurred before the returned value was stored in the
> destination.

One can build wait-free atomic updates.

> The existing APIs just doesn't make memleak-safe
> implementations possible.

They aren't sufficient, but I guess they could be extended to make them
sufficient.

> >> +Functions annotated with @code{lockleak} may leak locks if asynchronous
> >> +thread cancellation interrupts their execution.
> 
> > So this is not a leaking lock, but a non-cleaned-up acquisition of a
> > lock?  "lockacquisitionleak" is longer, but might be more clear?  Or
> > "lockacqleak"?
> 
> They take a lock and don't release it, so the taken lock leaks.  I
> understand the distinction you make, but I fail to see what good would
> come out of this sort of hair splitting.  Like, what else might lockleak
> be expected to mean?

That the lock object itself leaks (eg, that it's not destroyed).  Which,
I guess, might matter even for a robust lock (where an acquisition could
"leak" and we might still have some way to reclaim it).

> >> +@item @code{asynconsist}
> >> +@cindex asynconsist
> >> +
> >> +Functions marked with @code{asynconsist} take a recursive lock to ensure
> >> +MT-Safety while accessing or modifying data structures guarded by the
> >> +lock.
> 
> > Should this be "asyncconcist" (ie, don't drop one of the "c"'s)?
> > I was reading "async-on-sist" and wondering what it means...
> 
> It's a partial overlap of async (signal) and inconsist (for
> inconsistency).  Doubling the c wouldn't make it any less misleading ;-)

Alright, why not async-inconsist, or asyncinconsist then?

> >> +Functions marked with @code{incansist} modify data structures in a
> >> +non-atomic way.
> 
> > What does "incansist" stand for?
> 
> potential
> inconsistency
>    ^
> if |
>  *can*celled
> 
> 
> The difficulties in guessing the precise meanings of the keywords, the
> risk of their misleading as to the issue they refer to, and even my own
> difficulty in keeping the precise meanings in mind and avoiding their
> drift over time might indicate that made-up keywords weren't such a
> great idea, and we might be better off with some numerical rather than
> symbolic code.  Personally, I prefer symbols over numbers, but then, I'm
> biased ;-)  Your thoughts?

Just longer keywords?  I'd prefer something that has a somewhat
descriptive name over a number.

Follow-Ups:
- Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Alexandre Oliva
- Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Alexandre Oliva

References:
- [MTASCsft PATCH WIP5 00/33] MT-, AS- and AC-Safety docs
  - From: Alexandre Oliva
- [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Alexandre Oliva
- Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Torvald Riegel
- Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Alexandre Oliva

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]