This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Fix sem_post race (bug 14532)


On Tue, Sep 11, 2012 at 5:41 PM, Torvald Riegel <triegel@redhat.com> wrote:
>
> On Thu, 2012-08-30 at 21:09 +0000, Joseph S. Myers wrote:
> > Bug 14532 is a race condition in sem_post (generic C version), where
> > it wrongly uses an acquire barrier rather than a release barrier, so
> > wrongly allowing memory operations from before the unlock to be moved
> > after it.
>
> What is your (or anybody else's) opinion about documenting concurrent
> pieces of code in more detail to make issues like this one less likely?
>
> For example, something like:
>
> > 7e9..67e8cc5 100644
> > --- a/nptl/sysdeps/unix/sysv/linux/sem_post.c
> > +++ b/nptl/sysdeps/unix/sysv/linux/sem_post.c
> > @@ -1,5 +1,5 @@
> >  /* sem_post -- post to a POSIX semaphore.  Generic futex-using version.
> > -   Copyright (C) 2003, 2004, 2007, 2008 Free Software Foundation, Inc.
> > +   Copyright (C) 2003-2012 Free Software Foundation, Inc.
> >     This file is part of the GNU C Library.
> >     Contributed by Jakub Jelinek <jakub@redhat.com>, 2003.
> >
> > @@ -40,7 +40,7 @@ __new_sem_post (sem_t *sem)
> >         return -1;
> >       }
> >      }
> > -  while (atomic_compare_and_exchange_bool_acq (&isem->value, cur + 1,
> > cur));
> > +  while (atomic_compare_and_exchange_bool_rel (&isem->value, cur + 1,
> > cur));
>
> /* This rel memory order synchronizes with the acq memory order in
> atomic_decrement_if_positive() called by waiters.  We don't need acq_rel
> here because even though we can race with other posters, we only use the
> CAS to modify isem->value, so reading an outdated value of isem->value
> does not matter. */

I would accept such a patch.

I don't like the use of `acq_rel' and would prefer we come up with some
consistent single name for an operation that has both acquire and release
semantics (totally bike-shed though).

I'm strongly in favour of such comments in the code, in fact I wish all
atomic sequences had such descriptions. Writing them up is quite a bit of
work but goes a long way towards documenting expectations for future
developers.

> Side notes:
> - Could not having acq_rel matter on ARM? I've never read the ARM memory
> model, but heard that there's no guarantee that the most recent value
> would get picked up eventually. Although I suspect that the CAS would
> do.

I don't have an example where not having acq_rel would matter.

In my mind these kinds of problems are best discussed with a specific example,
otherwise it's purely speculation.

> - Should atomic_decrement_if_positive() document the acq memory order
> that is has (and must have)?
>
> >    atomic_full_barrier ();
>
> Why do we have the full barrier after the rel barrier?  Is it necessary?
> Was it meant to have the affect of the rel barrier, but was incorrectly
> placed after the CAS?

I don't know why there is an atomic_full_barrier(), at first glance it
seems unneeded, but as a conservative project we'd have to prove that
it's not needed before removing it. It appears to have been present since
the function was rewritten in 2007 by Ulrich.

Cheers,
Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]