This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



stack_info::walk and alloca don't mix


Hi all,

FYI in case anyone else has been seeing strange crashes inside calls to api_fatal():

It seems that functions which use alloca() set up a non-standard stack frame which confuses both stack_info::walk and windbg. The former tends to either enter an infinite loop or end up executing code in la-la land; the latter crashes instantly. Worse, if an exception handler is active, it detects the crash and attempts to generate a (second) stack dump, which crashes in turn, and so on until the stack space is exhausted and the process terminates.

The above seems to be the reason why fork failures often emit the "died waiting for longjmp" message instead of (or in addition to) "resource temporarily unavailable" -- the failed child enters an infinite loop trying to error-exit and the parent eventually times out.

The pernicious part is, gcc converts even normal stack allocations into alloca calls if they are "large," so just eliminating direct calls to alloca isn't enough. For example, dll_list::alloc declares a "WCHAR name[NT_MAX_PATH]" which gcc turns into a call to alloca() under the hood. The relevant assembler output is:

mov    $0x1002c,%eax
call   0x6115cde0 <_alloca>
mov    %edi,0x10024(%esp)
mov    0x10034(%esp),%edi
mov    %ebp,0x10028(%esp)
lea    0x1c(%esp),%ebp
mov    %ebx,0x1001c(%esp)
mov    %esi,0x10020(%esp)
movl   $0x10000,0x8(%esp)
mov    %ebp,0x4(%esp)
mov    %edi,(%esp)

mov    0x1001c(%esp),%ebx
mov    0x10020(%esp),%esi
mov    0x10024(%esp),%edi
mov    0x10028(%esp),%ebp
add    $0x1002c,%esp
ret
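For anyone who wants to reproduce this outside the Cygwin tree, here is a minimal sketch of a function with the same kind of frame. The names copy_name_large_frame and kMaxPathChars are made up for illustration; I'm assuming NT_MAX_PATH is 32768 two-byte characters, which is consistent with the 0x10000 in the listing above. On a 32-bit Windows target gcc will typically emit the stack-probe call for a frame this size, though whether and how it does so depends on the target and compiler version.

```cpp
#include <cstddef>

// Assumed stand-in for NT_MAX_PATH (hypothetical name, not from the Cygwin tree).
constexpr std::size_t kMaxPathChars = 32768;

// There is no alloca() call in the source, but the local array makes the
// frame roughly 64 KiB, which is what triggers the probe in the prologue.
std::size_t copy_name_large_frame(const char16_t *src)
{
    char16_t name[kMaxPathChars];   // 0x10000 bytes with a 2-byte char16_t
    std::size_t n = 0;
    while (n + 1 < kMaxPathChars && src[n] != 0) {
        name[n] = src[n];
        ++n;
    }
    name[n] = 0;                    // always NUL-terminate
    return n;                       // number of characters copied
}
```

Compiling this with -S and looking at the prologue should show whether your toolchain produces the same pattern.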

As nearly as I can tell, debug info would be required to recover the caller's saved %ebp from such a stack frame, because it is no longer stored where a frame-pointer walk expects to find it. Problem is, I don't know any way to identify such a stack frame short of using debug info, either.
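To make the failure mode concrete, here is a simulated sketch of a frame-pointer walk. This is not the stack_info::walk code; it assumes a walker that just follows saved-%ebp links, and the "stack" is a plain vector where the index plays the role of the address. The names walked_frame and walk_frames are hypothetical. It can't recover the real caller from a bad frame, but it shows two cheap sanity checks (a depth cap, and requiring each link to move toward higher addresses) that at least turn the infinite loop into an early stop.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One recovered frame: where its frame pointer was and its return address.
struct walked_frame { std::size_t fp; std::uintptr_t ret; };

// Simulated layout: stack[fp] holds the caller's frame pointer and
// stack[fp + 1] holds the return address, as in a standard frame.
std::vector<walked_frame> walk_frames(const std::vector<std::uintptr_t> &stack,
                                      std::size_t fp, std::size_t max_depth)
{
    std::vector<walked_frame> out;
    while (out.size() < max_depth &&
           stack.size() >= 2 && fp <= stack.size() - 2) {
        out.push_back({fp, stack[fp + 1]});
        std::size_t next = static_cast<std::size_t>(stack[fp]);
        // The stack grows down, so a caller's frame must sit at a higher
        // address. A link that does not move up is garbage: stop here
        // instead of looping or wandering off.
        if (next <= fp)
            break;
        fp = next;
    }
    return out;
}
```

With a well-formed chain this returns every frame; with a frame whose saved-%ebp slot holds something else it returns a truncated trace rather than hanging.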

An alternative might be to enable exceptions: even if no code actually throws exceptions, gcc emits unwind information which can be accessed quite easily using the definitions in libgcc's <unwind.h> (example below). The good thing is this info will be accurate unless the stack has been corrupted, but it's still far from ideal because it comes with all the space overheads that accompany exception handling. It also doesn't allow recovering function args, though I'm not sure that actually works in the presence of optimized code even today. That said, the unwind logic exits cleanly if it can't find any unwind info to use, so it might make a good debug-build option.

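#include <stdio.h>
#include <unwind.h>

/* Called once per frame; d points to the caller's depth counter. */
extern "C" _Unwind_Reason_Code trace_fcn(_Unwind_Context *ctx, void *d)
{
    int *depth = (int*)d;
    /* _Unwind_GetIP returns a pointer-sized integer; cast it to match
       the format (unsigned long is wide enough on 32-bit targets). */
    printf("\t0x%08lx\n", (unsigned long)_Unwind_GetIP(ctx));
    (*depth)++;
    return _URC_NO_REASON;  /* keep walking */
}

void print_backtrace_here()
{
    int depth = 0;
    printf("Stack trace:\n");
    _Unwind_Backtrace(&trace_fcn, &depth);
    fflush(stdout);
}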
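If this were to run from a fatal-error path, it would probably want to avoid printf and cap the number of frames. A possible variant, as a sketch only: trace_buf, collect_fcn and capture_backtrace are hypothetical names, and the only libgcc calls used are _Unwind_Backtrace and _Unwind_GetIP. It relies on the walk stopping as soon as the callback returns anything other than _URC_NO_REASON.

```cpp
#include <unwind.h>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: caller-provided storage for the captured addresses.
struct trace_buf {
    std::uintptr_t *frames;
    std::size_t capacity;
    std::size_t count;
};

// Called once per frame by _Unwind_Backtrace.
extern "C" _Unwind_Reason_Code collect_fcn(_Unwind_Context *ctx, void *d)
{
    trace_buf *buf = static_cast<trace_buf *>(d);
    if (buf->count >= buf->capacity)
        return _URC_END_OF_STACK;   // buffer full: ask the unwinder to stop
    buf->frames[buf->count++] =
        static_cast<std::uintptr_t>(_Unwind_GetIP(ctx));
    return _URC_NO_REASON;          // keep walking
}

// Fills frames[] with up to capacity return addresses and returns how many
// were stored. Does no allocation and no I/O.
std::size_t capture_backtrace(std::uintptr_t *frames, std::size_t capacity)
{
    trace_buf buf = { frames, capacity, 0 };
    _Unwind_Backtrace(&collect_fcn, &buf);
    return buf.count;
}
```

The caller can then format the addresses however it likes once the walk is done, e.g. std::uintptr_t f[32]; std::size_t n = capture_backtrace(f, 32);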


Thoughts?
Ryan



