This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: The direction of malloc?
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Siddhesh Poyarekar <siddhesh at redhat dot com>
- Cc: Will Newton <will dot newton at linaro dot org>, Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>, libc-alpha <libc-alpha at sourceware dot org>
- Date: Thu, 12 Dec 2013 09:34:01 +0100
- Subject: Re: The direction of malloc?
- Authentication-results: sourceware.org; auth=none
- References: <52A6A0DA dot 1080109 at redhat dot com> <CANu=Dmi32gwk-hQ3dDbj0d4_gs3FWqt02+NmveXH1p03Vm+Mfg at mail dot gmail dot com> <20131210121622 dot GA5416 at domone dot podge> <52A75502 dot 6040500 at linux dot vnet dot ibm dot com> <20131210210541 dot GA19161 at domone dot podge> <20131211023150 dot GA20835 at spoyarek dot pnq dot redhat dot com> <CANu=DmiTFu59qTgP=3Ks6_biCGvGqnis0++mmZdqX6-1FDWaUg at mail dot gmail dot com> <20131212034828 dot GJ20835 at spoyarek dot pnq dot redhat dot com>
On Thu, Dec 12, 2013 at 09:18:28AM +0530, Siddhesh Poyarekar wrote:
> On Wed, Dec 11, 2013 at 09:15:01AM +0000, Will Newton wrote:
> > On 11 December 2013 02:31, Siddhesh Poyarekar <siddhesh@redhat.com> wrote:
> > > On Tue, Dec 10, 2013 at 10:05:41PM +0100, Ondřej Bílka wrote:
> > >> > * Should we provide thread cache blocks to provide some lockless allocation?
> > >>
> > >> This is the lowest-hanging fruit that I aim for. We already use TLS to
> > >> determine the arena, so this should not be an issue.
> > >>
> > >> We have fastbins that sort of do this, but with several problems.
> > >> 1. They are not really lockless: malloc needs a lock, and only
> > >> freeing will be lockless once bug 15073 gets fixed.
> > >>
> > >> 2. Fastbins are per-arena, not per-thread, which forces us to use
> > >> atomic operations. These are expensive (typically more than 50 cycles).
> > >>
> > >> Moving these to per-thread bins mostly just needs refactoring of current
> > >> code to one that makes more sense.
> > >
> > > With arenas-per-thread, you essentially have contention-free access,
> > > which is not the same thing as lock-free, but not much worse. You'll
> > > have lock contention in per-thread arenas only when there are more
> > > threads than arenas, which in the default case means that you have
> > > more threads than twice the number of cores, which is too many threads
> > > anyway.
> >
> > Lock contention would be worse, but still the atomic instructions
> > required to lock/unlock the arena are the hottest part of the profile
> > on many single-threaded malloc workloads.
> >
> > If we are going to get a new malloc or update the old one I think the
> > fast path being lock-free should be a requirement.
>
> I think I misread Ondrej's post and thought he meant 'lock contention'
> when he actually only mentioned 'cost of atomic operations'. I agree
> that we need a lockless fast path.
>
No, you understood correctly that using atomics or locks is slow. As
Will said, it is the "hottest part of the profile on many single-threaded
malloc workloads": it is slow even though no contention is possible.
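
To make the difference concrete, here is a minimal sketch of the two
approaches. It is not glibc code: the names (shared_push, cache_alloc,
cache_free), the single 64-byte size class and the cap of 32 cached
chunks are all hypothetical, chosen only for illustration. The shared
list needs a compare-and-swap on every push even when only one thread
exists, while the per-thread cache touches only thread-local data and
needs no atomics or locks on its fast path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 64   /* hypothetical: one size class only */
#define CACHE_MAX  32   /* hypothetical: cap on chunks cached per thread */

/* A free chunk stores the link to the next free chunk in its own memory. */
struct chunk { struct chunk *next; };

/* Shared list (per-arena style): every push pays for an atomic
   compare-and-swap, even when no other thread ever touches the list. */
static _Atomic(struct chunk *) shared_bin;

void shared_push(struct chunk *c)
{
    struct chunk *head =
        atomic_load_explicit(&shared_bin, memory_order_relaxed);
    do {
        c->next = head;   /* head is refreshed by a failed CAS */
    } while (!atomic_compare_exchange_weak_explicit(
                 &shared_bin, &head, c,
                 memory_order_release, memory_order_relaxed));
}

/* Per-thread cache: plain loads and stores on thread-local variables. */
static _Thread_local struct chunk *tcache_head;
static _Thread_local unsigned tcache_count;

void *cache_alloc(void)
{
    struct chunk *c = tcache_head;
    if (c != NULL) {               /* fast path: pop from the local list */
        tcache_head = c->next;
        tcache_count--;
        return c;
    }
    return malloc(CHUNK_SIZE);     /* slow path: fall back to the allocator */
}

void cache_free(void *p)
{
    assert(CHUNK_SIZE >= sizeof(struct chunk));
    if (p == NULL)
        return;
    if (tcache_count < CACHE_MAX) {   /* fast path: push onto the local list */
        struct chunk *c = p;
        c->next = tcache_head;
        tcache_head = c;
        tcache_count++;
        return;
    }
    free(p);                          /* cache full: hand back to the allocator */
}
```

A chunk freed by one thread and allocated by the same thread never
leaves that thread's list, so a free followed by an allocation returns
the same pointer without a single atomic instruction. A real
implementation would also have to handle multiple size classes, chunks
freed by a different thread than the one that allocated them, and
draining the cache at thread exit.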