This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: The direction of malloc?
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>, libc-alpha at sourceware dot org
- Date: Thu, 19 Dec 2013 15:19:49 +0100
- Subject: Re: The direction of malloc?
- Authentication-results: sourceware.org; auth=none
- References: <20131210121622 dot GA5416 at domone dot podge> <52A75502 dot 6040500 at linux dot vnet dot ibm dot com> <20131210210541 dot GA19161 at domone dot podge> <1387213140 dot 23049 dot 8010 dot camel at triegel dot csb> <20131216212334 dot GA21284 at domone dot podge> <1387285197 dot 23049 dot 9075 dot camel at triegel dot csb> <20131217190817 dot GA32756 at domone dot podge> <1387324884 dot 23049 dot 10462 dot camel at triegel dot csb> <20131218121144 dot GA7787 at domone dot podge> <1387376049 dot 23049 dot 11021 dot camel at triegel dot csb>
On Wed, Dec 18, 2013 at 03:14:09PM +0100, Torvald Riegel wrote:
> On Wed, 2013-12-18 at 13:11 +0100, Ondřej Bílka wrote:
> > On Wed, Dec 18, 2013 at 01:01:24AM +0100, Torvald Riegel wrote:
> > > On Tue, 2013-12-17 at 20:08 +0100, Ondřej Bílka wrote:
> > > > On Tue, Dec 17, 2013 at 01:59:57PM +0100, Torvald Riegel wrote:
> > > > > On Mon, 2013-12-16 at 22:23 +0100, Ondřej Bílka wrote:
> > > > > > Please explain how spinning could improve performance in
> > > > > > single-threaded applications.
> > > > >
> > > > > You spoke about lockless code, so obviously concurrent code. My comment
> > > > > was thus referring to concurrent code. If you have a single-threaded
> > > > > program, then you can avoid synchronization, obviously (ignoring
> > > > > synchronization just for reentrancy...).
> > > > >
> > > > And for malloc, we use a switch variable to avoid the lock path and set
> > > > it when pthread_create is called? For reentrancy an ordinary variable
> > > > suffices.
> > >
> > > Depending on the algorithm, even for reentrancy you might need atomic
> > > operations (eg, to keep under control what the compiler does with the
> > > code, or using a CAS to avoid pending stores).
> >
> > For a reordering barrier it suffices to use
> >
> > __asm__ __volatile__( "" : : :"memory");
>
> You can use that to constrain what the compiler does, I agree. But why
> not use relaxed-memory-order atomic accesses right away, instead of
> trying to build the same thing manually with the compiler reorder
> barriers?
>
> > Could a pending write really happen? Isn't it the kernel's responsibility
> > to do serialization, which includes waiting for pending stores?
>
> I mean "pending" on a conceptual, synchronization-related level. It
> refers to a situation in which the thread is suspended (due to
> reentrancy, or also in concurrent settings) right before issuing a
> store. Because there's no operation before the store, it will always
> overwrite a change someone else might have done (eg, the signal
> handler), and there's no way to recover from that. Unless the userspace
> code can constrain where it is interrupted, there's nothing the kernel
> or the HW can do to avoid that. To avoid the pending-store problem, you
> use atomic read-modify-write ops or CAS.
>
> For example, you need those even for reentrancy-safe recursive locks;
> you can't just do
> if (lock.owner == NULL) { lock.owner = me; }
> because the store to lock.owner can be a pending store, which would
> overwrite whoever actually acquired the lock. You would need a CAS or
> similar.
We are in the single-threaded case. You do not need these for detecting
recursive behaviour. If you want to use CAS, then you need to use inline
assembly anyway, as gcc inserts a useless lock prefix there.
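
For reference, a minimal sketch of the CAS-based owner check from the quoted
text, written with C11 atomics. The `rlock` type and both function names are
hypothetical, made up for illustration; this is not glibc code. The point is
only that the test for "no owner" and the store of the new owner happen in one
read-modify-write step, so there is no window for a pending store.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical recursive lock, for illustration only (not glibc's). */
struct rlock {
    _Atomic(void *) owner;  /* NULL while the lock is free */
    unsigned depth;         /* recursion depth, touched only by the owner */
};

/* Try to take the lock on behalf of `me`; returns true on success. */
static bool rlock_try_acquire(struct rlock *l, void *me)
{
    /* Recursive acquisition: we already own the lock. */
    if (atomic_load_explicit(&l->owner, memory_order_relaxed) == me) {
        l->depth++;
        return true;
    }

    /* Replaces "if (owner == NULL) owner = me;": the compare and the
       store are a single atomic operation, so an interruption between
       them cannot make us overwrite another owner. */
    void *expected = NULL;
    if (atomic_compare_exchange_strong_explicit(&l->owner, &expected, me,
                                                memory_order_acquire,
                                                memory_order_relaxed)) {
        l->depth = 1;
        return true;
    }
    return false;  /* someone else holds the lock */
}

/* Release one level of recursion; must be called by the current owner. */
static void rlock_release(struct rlock *l)
{
    if (--l->depth == 0)
        atomic_store_explicit(&l->owner, NULL, memory_order_release);
}
```

On x86, gcc compiles the compare-exchange above to `lock cmpxchg`; that lock
prefix is the one referred to above as useless in the single-threaded case.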