This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: calloc speed difference


On 1/12/18, Christian Franke wrote:
> Lee wrote:
>> Why is the cygwin gcc calloc so much slower than the
>> i686-w64-mingw32-gcc calloc?
>>    1:12 vs 0:11
>>
>> $cat calloc-test.c
>> #include <stdio.h>
>> #include <stdlib.h>
>> #define ALLOCATION_SIZE (100 * 1024 * 1024)
>> int main (int argc, char *argv[]) {
>>      for (int i = 0; i < 10000; i++) {
>>          void *temp = calloc(ALLOCATION_SIZE, 1);
>>          if ( temp == NULL ) {
>>             printf("drat! calloc returned NULL\n");
>>             return 1;
>>          }
>>          free(temp);
>>      }
>>      return 0;
>> }
>>
>
> Could reproduce the difference on an older i7-2600K machine:
>
> Cygwin: ~20s
> MinGW: ~4s
>
> Timing [cm]alloc() calls without actually using the allocated memory
> might produce misleading results due to lazy page allocation and/or
> zero-filling.
>
> MinGW binaries use calloc() from msvcrt.dll. This calloc() does not call
> malloc() and then memset(). It directly calls:
>
>    mem = HeapAlloc(_crtheap, HEAP_ZERO_MEMORY, size);
>
> which possibly only reserves allocate-and-zero-fill-on-demand pages for
> later.

Which seems like it could be viewed as a feature?  Sort of like buying
on credit - you don't pay for it all up front, you just pay a bit each
time you reference another zero-fill-on-demand page.
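
To see where the cost actually lands, here is a minimal sketch that
times the calloc() call and the first write to each page separately.
The helper names (seconds_between, time_alloc_and_touch) are made up
for illustration and are not from either C library, and the 4096-byte
page size is the assumption used in the test variant quoted below.
clock() is used only because it exists in both toolchains; what it
measures can differ between runtimes, so treat the numbers as rough.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: seconds elapsed between two clock() readings. */
static double seconds_between(clock_t start, clock_t end) {
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/*
 * Hypothetical helper: allocate `size` zeroed bytes with calloc(), then
 * write one byte every `page` bytes so each page is referenced once.
 * Stores the time spent inside calloc() in *alloc_s and the time spent
 * touching the pages in *touch_s.
 * Returns 0 on success, -1 if calloc() returned NULL.
 */
static int time_alloc_and_touch(size_t size, size_t page,
                                double *alloc_s, double *touch_s) {
    assert(page > 0);

    clock_t t0 = clock();
    void *mem = calloc(size, 1);
    clock_t t1 = clock();
    if (mem == NULL)
        return -1;

    /* volatile keeps the compiler from discarding the writes */
    for (size_t j = 0; j < size; j += page)
        ((volatile char *)mem)[j] = 1;
    clock_t t2 = clock();

    free(mem);
    *alloc_s = seconds_between(t0, t1);
    *touch_s = seconds_between(t1, t2);
    return 0;
}
```

Calling time_alloc_and_touch(100 * 1024 * 1024, 4096, &alloc_s, &touch_s)
from main() and printing both values should show whether a given
calloc() pays for the zeroing up front (large alloc_s) or defers it to
the first reference of each page (large touch_s).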


> Cygwin's calloc() is different.
>
> This variant of the above code adds one write access to each 4KiB page
> (guarded by "volatile" to prevent dead assignment optimization):
>
> #include <stdio.h>
> #include <stdlib.h>
> #define ALLOCATION_SIZE (100 * 1024 * 1024)
> int main (int argc, char *argv[]) {
>      for (int i = 0; i < 1000; i++) {
>          void *temp = calloc(ALLOCATION_SIZE, 1);
>          if ( temp == NULL ) {
>             printf("drat! calloc returned NULL\n");
>             return 1;
>          }
>          for (int j = 0; j < ALLOCATION_SIZE; j += 4096)
>            ((volatile char *)temp)[j] = (char)i;
>          free(temp);
>      }
>      return 0;
> }
>
> Results:
>
> Cygwin: ~310s
> MinGW: ~210s

Wow!  Really nice explanation & example - Thank you.
Lee
