[ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.4

Ken Brown kbrown@cornell.edu
Mon Jul 6 13:33:00 GMT 2015


On 7/6/2015 9:15 AM, Ken Brown wrote:
> Hi Corinna,
>
> On 7/6/2015 6:01 AM, Corinna Vinschen wrote:
>> Hi Ken,
>>
>>
>> thanks for further testing this.
>>
>>
>> On Jul  5 22:15, Ken Brown wrote:
>>> On 7/5/2015 5:34 PM, Corinna Vinschen wrote:
>>>> This test release needs some good testing!
>>>
>>> I repeated the emacs experiment discussed in the "[ANNOUNCEMENT] TEST
>>> RELEASE: Cygwin 2.1.0-0.1" thread.  In the 32-bit case, the results were
>>> more-or-less the same as before: I forced a stack overflow, emacs recovered,
>>> I tried to continue working, there was a second SIGSEGV, and handle_sigsegv
>>> bailed out because garbage collection was in progress.  This time I was
>>> unable to prevent the second SIGSEGV by resetting max-specpdl-size and
>>> max-lisp-eval-depth.  I'm not sure what caused the second SIGSEGV, but it
>>> might have nothing to do with Cygwin.
>>>
>>> In the 64-bit case, however, the recovery from stack overflow never happened
>>> (i.e., the program never reached the siglongjmp).  Here's a gdb session:
>>> [...]
>>> 1647          if (!getrlimit (RLIMIT_STACK, &rlim))
>>> (gdb)
>>> 1656              beg = stack_bottom;
>>> (gdb)
>>> 1657              end = stack_bottom + stack_direction * rlim.rlim_cur;
>>> (gdb)
>>> 1658              if (beg > end)
>>> (gdb)
>>> 1660              addr = (char *) siginfo->si_addr;
>>> (gdb)
>>> 1663              if (beg < addr && addr < end
>>> (gdb) p beg
>>> $1 = 0x82ca27 ""
>>> (gdb) p addr
>>> $2 = 0x33ff8 ""
>>
>> I can't reproduce this.  It works fine for me.  For reference I attached
>> my simplified testcase again.   It's basically the emacs SIGSEGV setup,
>> main triggers the stack overflow, the handler tries to write a file for
>> testing if that works from the handler, then it siglongjmps.  The main
>> function tests if it can still fork, and then it repeats the action to
>> test if we're back to normal in terms of signal handling.
>>
>> If it works (and it does for me) the output looks like this:
>>
>>    $ ./sigalt
>>    command loop 1 before crash
>>    command loop 1 after crash
>>    In child
>>    In parent
>>    command loop 2 before crash
>>    command loop 2 after crash
>>    In child
>>    In parent
>>
>> On W8.1 for a standard GCC build of this testcase I get:
>>
>>    (gdb) p beg
>>    $1 = 0x40ac3 <error: Cannot access memory at address 0x40ac3>
>>    (gdb) p addr
>>    $2 = 0x43848 <error: Cannot access memory at address 0x43848>
>>    (gdb) p end
>>    $3 = 0x23cac3 ""
>>    (gdb) p/x rlim.rlim_cur
>>    $5 = 0x1fc000
>>
>> Check default stacksize:
>>
>>    $ peflags -x ./sigalt
>>    ./sigalt: stack reserve size      : 2097152 (0x200000) bytes
>>
>>    0x200000 - dead zone 4K - default W8.1 64 bit guardpagesize 3 * 4K ==
>>    0x1fc000, the value rlim.rlim_cur returns.  Looks good to me.
>>
>> On W8.1 32 bit under WOW:
>>
>>    (gdb) p beg
>>    $1 = 0x8fc33 ""
>>    (gdb) p addr
>>    $2 = 0x92d5c <error: Cannot access memory at address 0x92d5c>
>>    (gdb) p end
>>    $3 = 0x28cc33 ""
>>    (gdb) p/x rlim.rlim_cur
>>    $4 = 0x1fd000
>>
>>    $ peflags -x ./sigalt
>>    ./sigalt: stack reserve size      : 2097152 (0x200000) bytes
>>
>>    0x200000 - dead zone 4K - default W8.1 32 bit guardpagesize 2 * 4K ==
>>    0x1fd000.
>>
>> On W7 32 bit native:
>>
>>    (gdb) p beg
>>    $1 = 0x2ec43 "\376\356..."
>>    (gdb) p addr
>>    $2 = 0x32d6c ""
>>    (gdb) p end
>>    $3 = 0x22cc43 ""
>>    (gdb) p rlim.rlim_cur
>>    $4 = 2088960
>>    (gdb) p/x rlim.rlim_cur
>>    $5 = 0x1fe000
>>
>>    $ peflags -x ./sigalt
>>    ./sigalt: stack reserve size      : 2097152 (0x200000) bytes
>>
>>    0x200000 - dead zone 4K - default W7 32 bit guardpagesize 1 * 4K ==
>>    0x1fe000.
>>
>>> Note that addr < beg, so we never reach the siglongjmp.
>>
>> I have no explanation for this.  What OS?  What does rlim_cur contain?
>> What does peflags -x print for this executable?
>
> I'm on W7 64-bit.  The problem seems to be that rlim_cur is too big.
>
> $ peflags -x ./emacs
> ./emacs: stack reserve size      : 8388608 (0x800000) bytes
>
> (gdb) p beg
> $3 = 0x82ca27 ""
> (gdb) p/x rlim.rlim_cur
> $2 = 0x850e80
>
> So there's overflow when end is computed:
>
> (gdb) p end
> $4 = 0xfffffffffffdbba7 <error: Cannot access memory at address 0xfffffffffffdbba7>
>
> This doesn't happen when I run your testcase with the same 8MB stack size:
>
> $ peflags -x0x800000 ./sigalt.exe
> ./sigalt.exe: stack reserve size      : 8388608 (0x800000) bytes
>
> (gdb) p beg
> $1 = 0x82cabb ""
> (gdb) p/x rlim.rlim_cur
> $2 = 0x7fd000
> (gdb) p end
> $3 = 0x2fabb
>
>> And last but not least, what is emacs doing there?  The stack should be
>> pretty much in good shape when it's back to the main loop.  The stack
>> is fully committed and has the default number of guard pages at the bottom,
>> as it is just short of the stack overflow.
>>
>> For debugging purposes I also added a global variable called "tib" and a
>> memory info struct called "m" to the testcase which are initialized
>> right at the start of main.  tib points to the start of the TEB (Thread
>> Environment Block, a Windows per-thread bookkeeping structure) of the
>> main thread.  If you expand it right after it's fetched, you get
>> something along these lines:
>>
>>    (gdb) p *tib
>>    $2 = {ExceptionList = 0x22cd78, StackBase = 0x230000, StackLimit = 0x20c000,
>>      SubSystemTib = 0x0, {FiberData = 0x1e00, Version = 7680},
>>      ArbitraryUserPointer = 0x0, Self = 0x7ffdf000}
>>
>> Note the values of StackBase and StackLimit and compare with your beg and
>> end values.  StackBase is the upper limit of the stack.  It grows downward
>> from there.  StackLimit is the lowest address as yet committed.  It's not much
>> yet as you can see, 0x230000-0x20c000 == 0x24000 == 144K.  Since Cygwin
>> executables have a default stack of 2 Megs, the allocation base of the stack
>> is probably at 0x30000.  This can be checked by looking at m:
>>
>>    (gdb) p m
>>    $1 = {BaseAddress = 0x22c000, AllocationBase = 0x30000, AllocationProtect = 4,
>>      RegionSize = 16384, State = 4096, Protect = 4, Type = 131072}
>>
>> See the value of AllocationBase.
>>
>> When you hit the breakpoint in handle_sigsegv, the output of tib should
>> look like this:
>>
>>    (gdb) p *tib
>>    $2 = {ExceptionList = 0x22cd78, StackBase = 0x230000, StackLimit = 0x32000,
>>      SubSystemTib = 0x0, {FiberData = 0x1e00, Version = 7680},
>>      ArbitraryUserPointer = 0x0, Self = 0x7ffdf000}
>>
>> Observe the value of StackLimit.  For this output I ran the testcase on
>> W7 32 bit.  It has a default guard page size of 4K.  The new wrapper I wrote
>> in Cygwin restored the stack to its state right before the stack overflow
>> occurred:
>>
>>    - At 0x30000 we have the 4K dead zone, which is always only reserved,
>>      never committed.
>>
>>    - At 0x31000 the 4K guard page starts.
>>
>>    - Thus the StackLimit (the start of the committed region of the stack)
>>      is at 0x32000.
>>
>> You can utilize tib and m for testing in emacs as well.  Just do this:
>>
>>    #include <windows.h>
>>
>>    NT_TIB *tib;
>>    MEMORY_BASIC_INFORMATION m;
>>
>>    [...]
>>
>>    in main:
>>
>>    /* Record (approximately) where the stack begins.  */
>>    stack_bottom = &stack_bottom_variable;
>>    tib = (NT_TIB *) __readfsdword(PcTeb);
>>    VirtualQuery (stack_bottom, &m, sizeof m);
>
> I'll try this next and report back.

PcTeb seems to be defined only on x86.  What should I do on x86_64?

Ken

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


