Windows heaps and Cygwin heap

Corinna Vinschen corinna-cygwin@cygwin.com
Fri May 13 08:37:00 GMT 2011


I'm going to squeeze my rambling in between this thread since
it's related.  I just changed the subject.

On Apr 19 14:16, Ryan Johnson wrote:
> Regardless of file mapping behavior, though, I don't see right off
> how to make this problem go away. Nothing stops thread stacks or
> heaps from causing problems with other dlls, and they seem to move
> around even when they could have stayed put.

I noticed this as well on W7 32 bit.  Even the process default heap
(what's now called heap 0) is moved around wildly.

There are noticeable differences compared to Windows XP.  On XP, the
address ranges from 0x10000 to 0x230000 are neither heaps, nor is any
part of them shareable.

More importantly, the heaps on XP are not so volatile.  In contrast to
W7, if I start the same application multiple times on XP, the heaps are
always in the same spot.  Apparently some Windows DLLs create their own
heaps.  The more Windows DLLs are loaded in a process, the more heaps I
see.  On XP, all of them are always in the same place.  On W7 the heap
addresses are plain erratic.

Short break for the commercials:  We can safely make the assumption that
the differences have been introduced with Windows Vista.  So, where I
said W7, I actually mean NT 6.x, in contrast to the older NT 5.x
kernels.  Ok, back to the show.

What's more, if a non-forked Cygwin process starts, it creates a heap,
too.  This heap is not created using HeapCreate, but by simply calling
VirtualAlloc (NULL, ...).  This heap has a high probability of sitting
at a spot that one of the Win32 heaps occupies in the child after a fork.

Now that we know what happens, my comment in heap.cc from a few years
back makes more sense.  Note that I didn't understand what happens at
that time, but I found a quirky workaround, the heap_slop:

  /* For some obscure reason Vista and 2003 sometimes reserve space after
     calls to CreateProcess overlapping the spot where the heap has been
     allocated.  This apparently spoils fork.  The behaviour looks quite
     arbitrary.  Experiments on Vista show a memory size of 0x37e000 or
     0x1fd000 overlapping the usual heap by at most 0x1ed000.  So what
     we do here is to allocate the heap with an extra slop of (by default)
     0x400000 and set the appropriate pointers to the start of the heap
     area + slop.  A forking child then creates its heap at the new start
     address and without the slop factor.  Since this is not entirely
     foolproof we add a registry setting "heap_slop_in_mb" so the slop
     factor can be influenced by the user if the need arises. */
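
To illustrate the arithmetic behind the slop workaround, here is a
minimal sketch in C.  It is not the code from heap.cc: struct
heap_layout and layout_with_slop are hypothetical names, and the address
of the reservation is passed in instead of coming from VirtualAlloc, so
the sketch runs anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a heap reserved with extra slop in front. */
struct heap_layout
{
  uintptr_t reserve_base;	/* start of the whole reservation */
  size_t reserve_size;		/* chunk + slop */
  uintptr_t heap_base;		/* where the usable heap begins */
  uintptr_t heap_max;		/* end of the usable heap */
};

/* Compute the layout for a reservation that starts at reserved_at.
   The heap proper begins slop bytes into the reservation, so the low
   slop bytes are never used by the heap itself. */
struct heap_layout
layout_with_slop (uintptr_t reserved_at, size_t chunk, size_t slop)
{
  struct heap_layout l;

  assert (chunk <= SIZE_MAX - slop);	/* chunk + slop must not wrap */
  l.reserve_base = reserved_at;
  l.reserve_size = chunk + slop;
  l.heap_base = reserved_at + slop;
  l.heap_max = l.heap_base + chunk;
  return l;
}
```

With a made-up reservation address of 0xa00000, a chunk of 0x1000000
and the default slop of 0x400000, the heap proper starts at 0xe00000.
Assuming the system's stray reservation overlaps the low end of the
area, an overlap of up to 0x1ed000 bytes then stays below the heap.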

So we now know that we are actually observing a part of the ASLR
strategy of NT6.  Heap addresses are always randomized.  Yes, there is a
PE flag which controls ASLR on a per-executable basis, but unfortunately
this only influences the usage of ASLR for the executable image itself,
as well as the thread stacks.  There's no way at all to disable heap ASLR.

Bummer.

However, I think there is some light on the horizon.  For some reason
there appears to be a rule governing where the heaps are allocated.  The
addresses of the heaps seem to be below 0x20000000, unless all the
memory below 0x20000000 is already in use.  I observed the address
layout of many Cygwin processes from cat over mintty to emacs, and the
heaps are always in the range below 0x20000000.

Therefore my idea.  Wouldn't it make sense to define the area below
0x20000000 as a "no-go" area, left entirely for the digestion of the OS?

Instead of allocating the heap at a random address, we always start at
0x20000000.  If that fails, we search for the next free address after
that, using VirtualQuery.
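
The search can be sketched without touching the Win32 API.  The
following C snippet is a simulation under stated assumptions, not the
patch itself: struct region is a hypothetical stand-in for what
VirtualQuery reports, find_heap_base and round_up are made-up helper
names, and the caller supplies the allocation granularity instead of
asking the system for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one VirtualQuery result: a contiguous
   region of the address space that is either free or in use. */
struct region
{
  uintptr_t base;
  size_t size;
  int is_free;
};

/* Round x up to the next multiple of gran, which must be a power of two. */
static uintptr_t
round_up (uintptr_t x, uintptr_t gran)
{
  assert (gran != 0 && (gran & (gran - 1)) == 0);
  return (x + gran - 1) & ~(gran - 1);
}

/* Return the lowest granularity-aligned address >= start at which want
   bytes fit entirely inside a free region, or 0 if there is none.
   The map must be sorted by base address and must not overlap. */
uintptr_t
find_heap_base (const struct region *map, size_t n, uintptr_t start,
		size_t want, uintptr_t gran)
{
  for (size_t i = 0; i < n; i++)
    {
      uintptr_t end = map[i].base + map[i].size;

      /* Skip regions that are in use or lie entirely below start. */
      if (!map[i].is_free || end <= start)
	continue;
      /* A reservation can only begin at an aligned address. */
      uintptr_t cand = round_up (map[i].base > start ? map[i].base : start,
				 gran);
      if (cand < end && end - cand >= want)
	return cand;
    }
  return 0;
}
```

For example, with a map in which 0x20000000 is occupied and the first
large free region starts at 0x20188000, a request for 0x1000000 bytes
at a granularity of 0x10000 yields 0x20190000, the first aligned
address inside that region.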

Does that sound reasonable?  I created a matching patch, which I'd like
you to comment upon.  It even has comments!  I tested it by configuring
and building a few projects.  I'm still testing.  It would be nice if
others would test this change as well.  Here we go:


	* heap.cc (heap_init): Rewrite initial heap allocation to
	use addresses beyond 0x20000000.  Explain why and how.


Index: heap.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/heap.cc,v
retrieving revision 1.57
diff -u -p -r1.57 heap.cc
--- heap.cc	10 May 2011 15:39:02 -0000	1.57
+++ heap.cc	13 May 2011 08:29:24 -0000
@@ -17,6 +17,7 @@ details. */
 #include "dtable.h"
 #include "cygheap.h"
 #include "child_info.h"
+#include <sys/param.h>
 
 #define assert(x)
 
@@ -35,36 +36,80 @@ heap_init ()
   page_const = wincap.page_size ();
   if (!cygheap->user_heap.base)
     {
+      /* Starting with Vista, Windows performs heap ASLR.  This spoils
+	 the entire region below 0x20000000 for us, because that region
+	 is used by Windows to randomize heap addresses.  Therefore we
+	 put our heap into a safe region starting at 0x20000000.  This
+	 should work right from the start in 99% of the cases.  But,
+	 there's always a but.  Read on... */
+      uintptr_t start_address = 0x20000000L;
+      uintptr_t largest_found = 0;
+      size_t largest_found_size = 0;
+      SIZE_T ret;
+      MEMORY_BASIC_INFORMATION mbi;
+
       cygheap->user_heap.chunk = cygwin_shared->heap_chunk_size ();
-      /* For some obscure reason Vista and 2003 sometimes reserve space after
-	 calls to CreateProcess overlapping the spot where the heap has been
-	 allocated.  This apparently spoils fork.  The behaviour looks quite
-	 arbitrary.  Experiments on Vista show a memory size of 0x37e000 or
-	 0x1fd000 overlapping the usual heap by at most 0x1ed000.  So what
-	 we do here is to allocate the heap with an extra slop of (by default)
-	 0x400000 and set the appropriate pointers to the start of the heap
-	 area + slop.  A forking child then creates its heap at the new start
-	 address and without the slop factor.  Since this is not entirely
-	 foolproof we add a registry setting "heap_slop_in_mb" so the slop
-	 factor can be influenced by the user if the need arises. */
-      cygheap->user_heap.slop = cygwin_shared->heap_slop_size ();
-      while (cygheap->user_heap.chunk >= MINHEAP_SIZE)
+      do
 	{
-	  /* Initialize page mask and default heap size.  Preallocate a heap
-	   * to assure contiguous memory.  */
-	  cygheap->user_heap.base =
-	    VirtualAlloc (NULL, cygheap->user_heap.chunk
-	    			+ cygheap->user_heap.slop,
-			  alloctype, PAGE_NOACCESS);
+	  cygheap->user_heap.base = VirtualAlloc ((LPVOID) start_address,
+						  cygheap->user_heap.chunk,
+						  alloctype, PAGE_NOACCESS);
 	  if (cygheap->user_heap.base)
 	    break;
-	  cygheap->user_heap.chunk -= 1 * 1024 * 1024;
+
+	  /* Ok, so we are at the 1% which didn't work with 0x20000000 out
+	     of the box.  What we do now is to search for the next free
+	     region which matches our desired heap size.  While doing that,
+	     we keep track of the largest region we found. */
+	  start_address += wincap.allocation_granularity ();
+	  while ((ret = VirtualQuery ((LPCVOID) start_address, &mbi,
+				      sizeof mbi)) != 0)
+	    {
+	      if (mbi.State == MEM_FREE)
+		{
+		  if (mbi.RegionSize >= cygheap->user_heap.chunk)
+		    break;
+		  if (mbi.RegionSize > largest_found_size)
+		    {
+		      largest_found = (uintptr_t) mbi.BaseAddress;
+		      largest_found_size = mbi.RegionSize;
+		    }
+		}
+	      /* Since VirtualAlloc only reserves at allocation granularity
+	         boundaries, we round up here, too.  Otherwise we might end
+		 up at a bogus page-aligned address. */
+	      start_address = roundup2 (start_address + mbi.RegionSize,
+					wincap.allocation_granularity ());
+	    }
+	  if (!ret)
+	    {
+	      /* In theory this should not happen.  But if it happens, we have
+		 collected the information about the largest available region
+		 in the above loop.  So, next we squeeze the heap into that
+		 region, unless it's smaller than the minimum size. */
+	      if (largest_found_size >= MINHEAP_SIZE)
+		{
+		  cygheap->user_heap.chunk = largest_found_size;
+		  cygheap->user_heap.base =
+			VirtualAlloc ((LPVOID) largest_found,
+				      cygheap->user_heap.chunk,
+				      alloctype, PAGE_NOACCESS);
+		}
+	      /* Last resort (but actually we are probably broken anyway):
+		 Use the minimal heap size and let the system decide. */
+	      if (!cygheap->user_heap.base)
+		{
+		  cygheap->user_heap.chunk = MINHEAP_SIZE;
+		  cygheap->user_heap.base =
+			VirtualAlloc (NULL, cygheap->user_heap.chunk,
+				      alloctype, PAGE_NOACCESS);
+		}
+	    }
 	}
+      while (!cygheap->user_heap.base && ret);
       if (cygheap->user_heap.base == NULL)
-	api_fatal ("unable to allocate heap, heap_chunk_size %p, slop %p, %E",
-		   cygheap->user_heap.chunk, cygheap->user_heap.slop);
-      cygheap->user_heap.base = (void *) ((char *) cygheap->user_heap.base
-      						   + cygheap->user_heap.slop);
+	api_fatal ("unable to allocate heap, heap_chunk_size %p, %E",
+		   cygheap->user_heap.chunk);
       cygheap->user_heap.ptr = cygheap->user_heap.top = cygheap->user_heap.base;
       cygheap->user_heap.max = (char *) cygheap->user_heap.base
 			       + cygheap->user_heap.chunk;



Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


