This is the mail archive of the
cygwin-developers
mailing list for the Cygwin project.
stack_info::walk and alloca don't mix
- From: Ryan Johnson <ryan dot johnson at cs dot utoronto dot ca>
- To: cygwin-developers at cygwin dot com
- Date: Tue, 03 May 2011 10:59:45 -0400
- Subject: stack_info::walk and alloca don't mix
Hi all,
FYI in case anyone else has been seeing strange crashes inside calls to
api_fatal():
It seems that functions which use alloca() set up a non-standard stack
frame which confuses both stack_info::walk and windbg. The former tends
to either enter an infinite loop or end up executing code in la-la land;
the latter crashes instantly. Worse, if an exception handler was active,
it will detect the crash and attempt to generate a (second) stack dump,
leading to an infinite loop until the stack space is exhausted and the
process terminates.
The above seems to be the reason why fork failures often emit the "died
waiting for longjmp" message instead of (or in addition to) "resource
temporarily unavailable" -- the failed child enters an infinite loop
trying to error-exit and the parent eventually times out.
The pernicious part is, gcc converts even normal stack allocations into
alloca calls if they are "large," so just eliminating direct calls to
alloca isn't enough. For example, dll_list::alloc declares a "WCHAR
name[NT_MAX_PATH]" which gcc turns into a call to alloca() under the
hood. The relevant assembler output is:
mov $0x1002c,%eax
call 0x6115cde0 <_alloca>
mov %edi,0x10024(%esp)
mov 0x10034(%esp),%edi
mov %ebp,0x10028(%esp)
lea 0x1c(%esp),%ebp
mov %ebx,0x1001c(%esp)
mov %esi,0x10020(%esp)
movl $0x10000,0x8(%esp)
mov %ebp,0x4(%esp)
mov %edi,(%esp)
mov 0x1001c(%esp),%ebx
mov 0x10020(%esp),%esi
mov 0x10024(%esp),%edi
mov 0x10028(%esp),%ebp
add $0x1002c,%esp
ret
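For reference, a function of roughly the following shape should be enough to provoke this. It is a minimal sketch, not the actual dll_list::alloc source: the name big_local_demo, the constant kMaxPath (assumed to equal NT_MAX_PATH, which matches the 0x10000-byte size in the listing above for two-byte characters), and the use of char16_t in place of WCHAR are all illustrative. As far as I know, gcc on Windows targets emits the _alloca call as a stack probe once a frame is larger than about a page, so the exact threshold should be treated as an assumption.

```cpp
#include <cstddef>

// Assumed value of NT_MAX_PATH; 32768 two-byte chars = 0x10000 bytes.
constexpr std::size_t kMaxPath = 32768;

// Hypothetical stand-in for a function with a "large" local buffer.
// Nothing here calls alloca() directly; the large array alone is what
// makes the compiler insert the stack-probe call in the prologue.
std::size_t big_local_demo(const char16_t *src)
{
    char16_t name[kMaxPath];          // 64 KiB of locals in one frame
    std::size_t n = 0;
    while (src[n] != 0 && n + 1 < kMaxPath) {
        name[n] = src[n];             // copy into the local buffer
        ++n;
    }
    name[n] = 0;
    return n;                         // number of characters copied
}
```

Compiling something like this with -S for a Windows target and looking at the prologue is a quick way to check whether a given function is affected.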
As nearly as I can tell, debug info would be required to recover the
caller's saved %ebp from such a stack frame. Problem is, I don't know of
any way to identify such a stack frame short of using debug info, either.
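To make the failure mode concrete, here is a minimal sketch of a conventional frame-pointer walk, with the sanity checks that keep it from looping forever. It is not Cygwin's stack_info::walk: the Frame layout, walk_frames, and the depth limit are hypothetical, and the frames are simulated in an ordinary array so the example runs anywhere. The walk assumes every frame begins with the saved %ebp followed by the return address, which is exactly the assumption that a frame like the one above breaks, since %ebp is used there as a general-purpose register.

```cpp
#include <cstddef>
#include <cstdint>

// Simulated frame in the classic "push %ebp; mov %esp,%ebp" layout.
struct Frame {
    const Frame *saved_ebp;    // caller's frame pointer
    std::uintptr_t ret_addr;   // return address into the caller
};

// Follows saved_ebp links, writing return addresses to out[].
// Stops on a null link, on a pointer outside [lo, hi), on a link that
// does not move toward higher addresses (the stack grows down, so
// callers must sit higher), or after max_depth frames.
std::size_t walk_frames(const Frame *fp, const Frame *lo, const Frame *hi,
                        std::uintptr_t *out, std::size_t max_depth)
{
    std::size_t n = 0;
    while (fp != nullptr && n < max_depth) {
        if (fp < lo || fp >= hi)
            break;                         // pointer escaped the stack
        out[n++] = fp->ret_addr;
        const Frame *next = fp->saved_ebp;
        if (next != nullptr && next <= fp)
            break;                         // cycle or backwards link
        fp = next;
    }
    return n;
}
```

The checks cannot make a bogus %ebp meaningful, but they do turn "infinite loop or jump into la-la land" into a truncated trace.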
An alternative might be to enable exceptions: even if no code actually
throws exceptions, gcc emits unwind information which can be accessed
quite easily using the definitions in libgcc's <unwind.h> (example
below). The good thing is this info will be accurate unless the stack
has been corrupted, but it's still far from ideal because it comes with
all the space overheads that accompany exception handling. It also
doesn't allow recovering function args, though I'm not sure that
actually works in the presence of optimized code even today. That said,
the unwind logic exits cleanly if it can't find any unwind info to use,
so it might make a good debug-build option.
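#include <stdio.h>
#include <unwind.h>

// Called once per frame; d points to the caller's depth counter.
extern "C" _Unwind_Reason_Code trace_fcn(_Unwind_Context *ctx, void *d)
{
    int *depth = (int*)d;
    // _Unwind_GetIP returns an integer type, so cast it for %p.
    printf("\t%p\n", (void*)_Unwind_GetIP(ctx));
    (*depth)++;
    return _URC_NO_REASON;  // keep walking
}

void print_backtrace_here()
{
    int depth = 0;
    printf("Stack trace:\n");
    _Unwind_Backtrace(&trace_fcn, &depth);
    fflush(stdout);
}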
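A possible variation, since the code that wants a backtrace is usually already in trouble: collect the program counters into a fixed-size buffer with a hard depth cap, and leave the printing to the caller. This is only a sketch under stated assumptions. TraceBuf, collect_fcn, and collect_backtrace are hypothetical names, and it relies on libgcc's behaviour of ending the walk as soon as the callback returns anything other than _URC_NO_REASON.

```cpp
#include <cstddef>
#include <cstdint>
#include <unwind.h>

// Hypothetical helper state: caller-supplied buffer plus fill count.
struct TraceBuf {
    std::uintptr_t *pcs;
    std::size_t cap;
    std::size_t len;
};

extern "C" _Unwind_Reason_Code collect_fcn(_Unwind_Context *ctx, void *d)
{
    TraceBuf *buf = static_cast<TraceBuf *>(d);
    if (buf->len >= buf->cap)
        return _URC_END_OF_STACK;   // non-_URC_NO_REASON ends the walk
    buf->pcs[buf->len++] = static_cast<std::uintptr_t>(_Unwind_GetIP(ctx));
    return _URC_NO_REASON;          // continue to the next frame
}

// Returns the number of program counters written to pcs (at most cap).
std::size_t collect_backtrace(std::uintptr_t *pcs, std::size_t cap)
{
    TraceBuf buf = {pcs, cap, 0};
    _Unwind_Backtrace(&collect_fcn, &buf);
    return buf.len;
}
```

The cap bounds the work done even if the unwind info is somehow inconsistent, and avoiding stdio inside the callback keeps the walk itself free of calls that might need more stack or locks.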
Thoughts?
Ryan