This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] calloc should not duplicate malloc logic.
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Siddhesh Poyarekar <siddhesh at redhat dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Wed, 26 Feb 2014 17:25:21 +0100
- Subject: Re: [PATCH] calloc should not duplicate malloc logic.
- References: <20140221150417 dot GA4198 at domone dot podge> <20140226143648 dot GA32752 at spoyarek dot pnq dot redhat dot com>
On Wed, Feb 26, 2014 at 08:06:48PM +0530, Siddhesh Poyarekar wrote:
> On Fri, Feb 21, 2014 at 04:04:17PM +0100, Ondřej Bílka wrote:
> > To make future improvements of the allocator simpler, we could for now
> > have calloc just call malloc and memset. With that we could omit changes
> > that would duplicate malloc changes anyway.
> >
> > It would temporarily decrease its performance, which is not a primary
> > concern now, as we have just started the release cycle.
>
> I don't see the point of making changes that cause performance
> regressions on the basis of a future hypothetical allocator that would
Siddhesh, before you speak you need to check the facts. Use a benchmark
so you can quantify whether this is worth concern or whether the difference
will just disappear in the noise.
As there is currently no benchmark for this, I modified my malloc benchmark,
and the results look mostly indistinguishable.
Also, for subpage requests, trying not to clear memory is a bad idea anyway,
as the application will write to that area and the clearing will act as
prefetching, which could sometimes improve overall performance.
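To make the change under discussion concrete, here is a minimal sketch of a calloc built from malloc and memset. It is not the glibc code: the name simple_calloc is made up for illustration, and the overflow check is an assumption about what any such wrapper needs.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the glibc implementation: allocate
   nmemb * size bytes with malloc and clear them unconditionally.  */
void *
simple_calloc (size_t nmemb, size_t size)
{
  /* Reject requests whose byte count would overflow size_t.  */
  if (size != 0 && nmemb > SIZE_MAX / size)
    return NULL;

  size_t bytes = nmemb * size;
  void *mem = malloc (bytes);
  if (mem == NULL)
    return NULL;

  /* memset returns its first argument, so the cleared block can be
     returned directly.  */
  return memset (mem, 0, bytes);
}
```

The point of the sketch is only that all allocation logic stays in malloc; whether the unconditional memset costs anything measurable is what the numbers in this mail are about.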
> hypothetically be reviewed and approved in time for 2.20 release. It
> fails the test of being a contained changeset that brings about a net
> improvement in the code.
>
Performance is not the primary concern now, maintainability is. At the
previous Cauldron, improving malloc was one of the main points for
improvement. If we do not move quickly then we will not get a lot done.
> In other words, please make these changes in your private branch, do
> your allocator implementation there and then start posting patches
> that as a set would bring about a net improvement. Right now, this
> one patch does not. See Alex's work on MT safety docs for an example
> of how one ought to build upon and post large changes to the code
> base. It's a lot of work, but it's unavoidable if you want to keep
> the tree sane for everyone.
>
> > - return mem;
> > + return memset (mem, 0, bytes);
>
> You're calling memset even for maps that have been mmapped (and hence
> already zeroed), which is a really horrible thing to do. I can't
> imagine any allocator implementation that would require a
> simplification like this.
>
Siddhesh, the following program already consumes a gigabyte of RSS, so your
claim is nonsense again.
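#include <stdlib.h>
#include <unistd.h>
int main (void)
{
  /* Request 1 GB of zeroed memory, then idle so RSS can be inspected.  */
  char *x = calloc (1000000000, 1);
  sleep (1000);
  return 0;
}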
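One way to check a claim like this without watching the process from outside is to ask the kernel for the peak resident set size before and after the call. The following is a sketch only: the helper names are made up, it assumes Linux, where getrusage reports ru_maxrss in kilobytes, and it makes no claim about what value any particular calloc implementation will produce.

```c
#include <stdlib.h>
#include <sys/resource.h>

/* Peak resident set size of this process.  Linux reports ru_maxrss in
   kilobytes; other systems may use a different unit.  */
long
peak_rss_kb (void)
{
  struct rusage ru;
  if (getrusage (RUSAGE_SELF, &ru) != 0)
    return -1;
  return ru.ru_maxrss;
}

/* Hypothetical helper: how much the peak RSS grew across one calloc
   call of the given size.  Returns -1 if the allocation failed.  */
long
rss_growth_after_calloc (size_t bytes)
{
  long before = peak_rss_kb ();
  char *x = calloc (bytes, 1);
  if (x == NULL)
    return -1;

  /* Read one byte through a volatile pointer so the compiler cannot
     drop the allocation; this touches at most one page.  */
  volatile char *v = x;
  (void) v[0];

  long after = peak_rss_kb ();
  free (x);
  return after - before;
}
```

Calling rss_growth_after_calloc (1000000000) mirrors the program above; a result near one million kilobytes would mean the block was touched during calloc, a result near zero that it was not.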
> I see that the patch is already committed. I think it should be
> reverted.
You need to base your decisions on facts, not misconceptions.
The numbers for the old implementation (five runs) are:
allocations in range 1-8: 116.439
allocations in range 2-16: 113.176
allocations in range 4-32: 129.151
allocations in range 8-64: 148.03
allocations in range 16-128: 172.019
allocations in range 32-256: 241.726
allocations in range 64-512: 301.791
allocations in range 128-1024: 440.151
allocations in range 256-2048: 706.127
allocations in range 512-4096: 1252.86
allocations in range 1024-8192: 2590.85
allocations in range 2048-16384: 5307.75
allocations in range 4096-32768: 10866.7
allocations in range 1-8: 115.602
allocations in range 2-16: 113.477
allocations in range 4-32: 129.746
allocations in range 8-64: 149.746
allocations in range 16-128: 173.371
allocations in range 32-256: 240.015
allocations in range 64-512: 301.413
allocations in range 128-1024: 434.218
allocations in range 256-2048: 689.137
allocations in range 512-4096: 1218.65
allocations in range 1024-8192: 2531.41
allocations in range 2048-16384: 5181.56
allocations in range 4096-32768: 10697.3
allocations in range 1-8: 115.553
allocations in range 2-16: 113.357
allocations in range 4-32: 128.756
allocations in range 8-64: 148.953
allocations in range 16-128: 173.219
allocations in range 32-256: 240.545
allocations in range 64-512: 301.595
allocations in range 128-1024: 433.737
allocations in range 256-2048: 688.729
allocations in range 512-4096: 1221.62
allocations in range 1024-8192: 2533.32
allocations in range 2048-16384: 5181.41
allocations in range 4096-32768: 10697.1
allocations in range 1-8: 115.494
allocations in range 2-16: 113.288
allocations in range 4-32: 128.618
allocations in range 8-64: 148.702
allocations in range 16-128: 173.045
allocations in range 32-256: 239.946
allocations in range 64-512: 301.664
allocations in range 128-1024: 434.581
allocations in range 256-2048: 690.666
allocations in range 512-4096: 1215.94
allocations in range 1024-8192: 2530.69
allocations in range 2048-16384: 5183.24
allocations in range 4096-32768: 10696.7
allocations in range 1-8: 115.769
allocations in range 2-16: 113.532
allocations in range 4-32: 129.759
allocations in range 8-64: 149.928
allocations in range 16-128: 173.378
allocations in range 32-256: 239.979
allocations in range 64-512: 301.394
allocations in range 128-1024: 434.942
allocations in range 256-2048: 695.047
allocations in range 512-4096: 1218.08
allocations in range 1024-8192: 2532.12
allocations in range 2048-16384: 5182.49
allocations in range 4096-32768: 10699.2
And the new one has:
allocations in range 1-8: 123.532
allocations in range 2-16: 119.636
allocations in range 4-32: 126.008
allocations in range 8-64: 137.936
allocations in range 16-128: 163.974
allocations in range 32-256: 254.418
allocations in range 64-512: 299.129
allocations in range 128-1024: 436.888
allocations in range 256-2048: 704.616
allocations in range 512-4096: 1262.59
allocations in range 1024-8192: 2701
allocations in range 2048-16384: 5474.95
allocations in range 4096-32768: 11667.6
allocations in range 1-8: 142.738
allocations in range 2-16: 119.409
allocations in range 4-32: 125.556
allocations in range 8-64: 142.533
allocations in range 16-128: 226.518
allocations in range 32-256: 265.232
allocations in range 64-512: 314.735
allocations in range 128-1024: 509.436
allocations in range 256-2048: 815.627
allocations in range 512-4096: 1294.84
allocations in range 1024-8192: 2628.28
allocations in range 2048-16384: 5562.15
allocations in range 4096-32768: 11176.7
allocations in range 1-8: 138.322
allocations in range 2-16: 120.867
allocations in range 4-32: 125.341
allocations in range 8-64: 137.924
allocations in range 16-128: 193.724
allocations in range 32-256: 238.415
allocations in range 64-512: 300.145
allocations in range 128-1024: 431.434
allocations in range 256-2048: 690.693
allocations in range 512-4096: 1232.5
allocations in range 1024-8192: 2591.6
allocations in range 2048-16384: 5342.07
allocations in range 4096-32768: 11156.6
allocations in range 1-8: 122.718
allocations in range 2-16: 118.967
allocations in range 4-32: 124.392
allocations in range 8-64: 137.411
allocations in range 16-128: 163.914
allocations in range 32-256: 236.75
allocations in range 64-512: 300.082
allocations in range 128-1024: 432.996
allocations in range 256-2048: 699.238
allocations in range 512-4096: 1239.19
allocations in range 1024-8192: 2594.03
allocations in range 2048-16384: 5349.64
allocations in range 4096-32768: 11057.2
allocations in range 1-8: 122.341
allocations in range 2-16: 118.827
allocations in range 4-32: 124.806
allocations in range 8-64: 137.242
allocations in range 16-128: 163.031
allocations in range 32-256: 236.438
allocations in range 64-512: 301.368
allocations in range 128-1024: 432.473
allocations in range 256-2048: 698.872
allocations in range 512-4096: 1239.46
allocations in range 1024-8192: 2594
allocations in range 2048-16384: 5346.13
allocations in range 4096-32768: 11056.8
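To put one number on a pair of columns, the five runs for a range can be averaged and compared. The sketch below does that for the 4096-32768 range; the function names are made up for illustration, and the arrays are copied from the listings above.

```c
#include <stddef.h>

/* Arithmetic mean of n samples.  */
double
mean (const double *v, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += v[i];
  return sum / (double) n;
}

/* Relative change of the new mean against the old mean, in percent.
   A positive value means the new runs took longer on average.  */
double
relative_change_percent (const double *old_runs, const double *new_runs,
                         size_t n)
{
  double old_mean = mean (old_runs, n);
  return 100.0 * (mean (new_runs, n) - old_mean) / old_mean;
}

/* The five runs for the 4096-32768 range, old and new implementation.  */
const double old_4096[] = { 10866.7, 10697.3, 10697.1, 10696.7, 10699.2 };
const double new_4096[] = { 11667.6, 11176.7, 11156.6, 11057.2, 11056.8 };
```

relative_change_percent (old_4096, new_4096, 5) gives the difference for that range; the same call with the rows of another range gives the corresponding figure there. With only five runs per side this says nothing about variance, so it is a rough comparison at best.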
And the benchmark is the following:
/* Measure malloc and free running time.
Copyright (C) 2013 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#define TEST_MAIN
#define TEST_NAME "malloc"
#include "bench-string.h"
void
do_test (size_t size)
{
  timing_t start, stop, cur;
  const int iters = 1 << 18;
  unsigned int r_seed = 42;
  void *ary[iters];
  size_t sizes[iters];
  size_t idx[iters];

  /* Precompute the request sizes and target slots so that rand_r stays
     out of the timed loop.  */
  for (int i = 0; i < iters; i++)
    {
      ary[i] = NULL;
      sizes[i] = size + rand_r (&r_seed) % (7 * size);
      idx[i] = rand_r (&r_seed) % (iters / 64);
    }

  printf ("\n allocations in range %zu-%zu:", size, 8 * size);

  TIMING_NOW (start);
  /* Replace a random live slot on every iteration.  */
  for (int i = 0; i < iters; ++i)
    {
      free (ary[idx[i]]);
      ary[idx[i]] = calloc (sizes[i], 1);
    }
  /* Release whatever is still allocated; unused slots are NULL.  */
  for (int i = 0; i < iters; ++i)
    free (ary[i]);
  TIMING_NOW (stop);

  TIMING_DIFF (cur, start, stop);
  TIMING_PRINT_MEAN ((double) cur, (double) iters);
}

static int
test_main (void)
{
  /* Five passes over the size classes 1, 2, 4, ..., 4096.  */
  for (int j = 0; j < 5; ++j)
    for (int i = 1; i < 5000; i *= 2)
      do_test (i);
  return 0;
}
#include "../test-skeleton.c"
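The benchmark above depends on the glibc benchtests harness (bench-string.h, the TIMING_* macros and test-skeleton.c). A rough standalone variant, assuming POSIX clock_gettime and rand_r are available, could look like the following sketch. The function name bench_calloc and the nanosecond unit are choices made here, not part of the original, so its numbers are not directly comparable with the listings.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Hypothetical standalone variant of do_test: returns the mean time in
   nanoseconds per iteration, or -1.0 on bad arguments or if the
   bookkeeping arrays cannot be allocated.  */
double
bench_calloc (size_t size, size_t iters)
{
  if (size == 0 || iters < 64)
    return -1.0;

  unsigned int seed = 42;
  void **ary = calloc (iters, sizeof *ary);
  size_t *sizes = malloc (iters * sizeof *sizes);
  size_t *idx = malloc (iters * sizeof *idx);
  if (ary == NULL || sizes == NULL || idx == NULL)
    {
      free (ary);
      free (sizes);
      free (idx);
      return -1.0;
    }

  /* Same size and slot distribution as the original benchmark.  */
  for (size_t i = 0; i < iters; i++)
    {
      sizes[i] = size + rand_r (&seed) % (7 * size);
      idx[i] = rand_r (&seed) % (iters / 64);
    }

  struct timespec start, stop;
  clock_gettime (CLOCK_MONOTONIC, &start);
  for (size_t i = 0; i < iters; i++)
    {
      free (ary[idx[i]]);
      ary[idx[i]] = calloc (sizes[i], 1);
    }
  for (size_t i = 0; i < iters; i++)
    free (ary[i]);
  clock_gettime (CLOCK_MONOTONIC, &stop);

  double ns = (double) (stop.tv_sec - start.tv_sec) * 1e9
              + (double) (stop.tv_nsec - start.tv_nsec);

  free (ary);
  free (sizes);
  free (idx);
  return ns / (double) iters;
}
```

Calling bench_calloc (i, 1 << 18) for i = 1, 2, 4, ..., 4096 reproduces the loop structure of test_main. The bookkeeping arrays live on the heap here instead of the stack, which keeps the sketch independent of the stack size limit.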