addr2line [ Was: better stackdumps ]

Christopher Faylor
Thu Mar 20 18:38:00 GMT 2008

On Thu, Mar 20, 2008 at 11:23:05AM -0700, Brian Dessent wrote:
>Corinna Vinschen wrote:
>> Is it a big problem to fix addr2line to deal with .dbg files?
>> I like your idea to add names to the stackdump especially because of
>> addr2line's brokenness.  But, actually, if addr2line would work with
>> .dbg files, there would be no reason to add this to the stackdump file.
>I absolutely agree that addr2line and/or dumper and/or gdb should be
>fixed, regardless of this patch.  I never meant to imply an either/or
>situation, and in fact I have debugged addr2line and here are the
>reasons it's broken:
>Firstly it's got nothing to do with the .gnu_debuglink separate debug file,
>that part works just fine.  And secondly addr2line only loads the debug
>information for the module that you supply with -e, meaning that if you
>give "-e a.exe" it will look at symbols for a.exe, but it doesn't know
>that a.exe is dynamically linked to cygwin1.dll and it won't try to load
>symbols for cygwin1.dll.  This means to use it you need to know
>beforehand which module the address is in, which right there makes it
>kind of a pain to use for DLLs, and to me it rather dilutes the argument
>that you can just postprocess a stackdump file with it since you need
>more information than what's there.
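
To make the "which module is this address in" step concrete, here is a minimal sketch of the lookup a user has to do by hand before calling addr2line. The load map is an assumption: the 0x61000000 base is consistent with the cygwin1.dll addresses quoted in this thread, but the image sizes and the a.exe entry are made up, and the helper names are hypothetical.

```python
# Hypothetical load map: (module path, base address, image size).
# The sizes and the a.exe entry are illustrative, not taken from a real process.
MODULES = [
    ("a.exe", 0x00400000, 0x00010000),
    ("/bin/cygwin1.dll", 0x61000000, 0x00200000),
]


def module_for(addr, modules=MODULES):
    """Return the path of the module whose image contains addr, or None."""
    for path, base, size in modules:
        if base <= addr < base + size:
            return path
    return None


def addr2line_command(addr, modules=MODULES):
    """Build the addr2line argument list for addr, or None if no module matches."""
    path = module_for(addr, modules)
    if path is None:
        return None
    # -e selects the module to read symbols from, -f also prints the function name.
    return ["addr2line", "-e", path, "-f", hex(addr)]
```

A plain stackdump contains only the addresses, so the load map has to come from somewhere else, which is the extra information referred to above.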
>The next problem is that addr2line first tries to read STABS, and if
>that fails it falls back to DWARF-2.  I always build Cygwin and most
>other things with DWARF-2 debug symbols, mainly to make sure they work
>but really aren't we eventually hoping to get rid of STABS?  Anyway,
>this exposed another problem in that even if you build all of Newlib and
>Cygwin with -gdwarf-2 or -ggdb3, you still get a handful of STABS symbols
>which are hardcoded in various assembler files:
>  asm (".stabs \"" msg "\",30,0,0,0\n\t" \
>  ".stabs \"_" #symbol "\",1,0,0,0\n");
>This is used to insert a linktime warning for using mktemp().
>sigfe.s:3:      .stabs  "_sigfe:F(0,1)",36,0,0,__sigfe
>sigfe.s:44:     .stabs  "_sigbe:F(0,1)",36,0,0,__sigbe
>sigfe.s:70:     .stabs  "sigreturn:F(0,1)",36,0,0,_sigreturn
>sigfe.s:108:    .stabs  "sigdelayed:F(0,1)",36,0,0,_sigdelayed
>This becomes a problem in that when bfd tries to find an address in the
>debug data it sees these minimal STABS and considers them a match --
>even though they are mostly irrelevant, they are present and since it's
>only got an address to go by it doesn't know that there is a much better
>match in the DWARF-2 data.  It just sees that it has gotten a (bad)
>match, so it doesn't bother looking in the DWARF-2 data.  And since
>those hand-coded .stabs above only give symbol name locations, not line
>number information, that means that regardless of what you ask addr2line
>it's going to return nothing because it only cares about line number
>information.
>I see two potential fixes here, the first being that Cygwin could be
>adapted to not hardcode .stabs but rather detect whether it's being
>built with DWARF-2 or STABS and use the appropriate kind.  The other fix
>is to teach BFD to try DWARF-2 first before STABS.  The attached patch
>does this, for the purposes of illustration -- I don't really claim this
>is correct.
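
The effect of the search order can be shown with a toy model. This is not BFD's code: the two finder functions, the tables, the _sigfe address and the example_func entry are all hypothetical, and only the _sigbe address is taken from this thread. The point is just that a first-match-wins search returns the symbol-only STABS hit unless DWARF-2 is consulted first.

```python
import bisect

# Symbol-only records, like the hand-written .stabs: a start address and a name,
# with no file or line. The _sigfe address is made up for illustration.
STABS_SYMBOLS = [(0x610AA400, "_sigfe"), (0x610AA4A8, "_sigbe")]

# Hypothetical DWARF-2 line table: address -> (function, file, line).
DWARF2_LINES = {0x610FDD3B: ("example_func", "example.c", 42)}


def find_stabs(addr):
    """Claim the nearest preceding STABS symbol as a match, without line info."""
    starts = [start for start, _ in STABS_SYMBOLS]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    return (STABS_SYMBOLS[i][1], None, None)


def find_dwarf2(addr):
    """Return the DWARF-2 record for addr, or None if there is none."""
    return DWARF2_LINES.get(addr)


def lookup(addr, finders):
    """First-match-wins search: return the first non-None result."""
    for find in finders:
        hit = find(addr)
        if hit is not None:
            return hit
    return None


# STABS first: the poor match hides the DWARF-2 record.
stabs_first = lookup(0x610FDD3B, [find_stabs, find_dwarf2])
# DWARF-2 first: the record with file and line is found.
dwarf_first = lookup(0x610FDD3B, [find_dwarf2, find_stabs])
```

In this model stabs_first has no file or line, so a tool that only reports line information prints nothing, while dwarf_first carries both.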
>Once that is applied, here is the result of running the patched
>addr2line on the addresses in the stackdump of this testcase:
>$ for F in 610F74B1 610FDD3B 6110A310 610AA4A8 61006094; do
>    /build/combined/binutils/.libs/addr2line.exe -e /bin/cygwin1.dll -f 0x$F
>done
>It now gets 3 out of 5 correct.  It got tripped up on _sigbe because
>again addr2line only cares about line number info, not general address
>information, and while there are records for the location of _sigbe,
>they don't contain line number info:
>(gdb) i ad _sigbe
>Symbol "_sigbe" is at 0x610aa4a8 in a file compiled without debugging.
>For the top frame (strlen), addr2line could not print anything because
>while there is location information, there is no line number
>information:
>(gdb) i li *0x610F74B1
>No line number information available for address 0x610f74b1 <strlen+17>
>This is due to the fact that strlen is implemented in newlib as
>libc/machine/i386/strlen.S which is a straight assembler version, and
>hence has no line number debug records.
>*** To summarize thus far:
>1. addr2line can be made to work again by one of a) dictating the use of
>STABS (boo!), b) modifying Cygwin to not emit hardcoded .stabs
>directives directly, c) modifying BFD to prefer DWARF-2 to STABS when
>reading COFF files.
>2. addr2line requires the user to know beforehand which DLL a symbol is
>in, because it can't resolve runtime dependencies.
>3. addr2line only cares about line number debug records, which means it
>will be incapable of representing many symbols.
>4. As an implication of 3), addr2line is totally useless on DLLs/EXEs
>without debug information available.
>I think point number 4 is worth repeating: we as developers take for
>granted having debug symbols present, but most users do not have that
>luxury.  In that case addr2line becomes much less useful if it means
>first having to download the Cygwin source and build it, or know that
>it's possible to extract the .dbg file from the -src package.  (Although
>this won't work very well for use with gdb since the source files won't
>be present and even if they are the path to them in the .dbg file won't
>be correct.)  This all assumes that they even figured out that addr2line
>is part of binutils and installed that in the first place.
>And that is what I think makes this worthwhile, and worth putting in the
>DLL itself: the ability to get some useful info about the fault without
>requiring any developer tools or setup.  Even if the user has no idea
>how to use a debugger, they could still potentially paste the backtrace
>in an email to the list and someone might be able to make sense of
>it.  How many times in the past has someone done this only to the
>response of "a plain stackdump file without symbols is useless as we
>don't know what those hex addresses correspond to in your particular set
>of DLLs"?  This would fix that.
>Christopher Faylor wrote:
>> There's still the issue of dealing with the separate signal stack.  That
>> makes stack dumps less than useful.
>Yes, it means there is one frame that says "sigbe" instead of the actual
>return location somewhere else.  I don't think that's impossible to fix
>either: the fault handler gets the context of the faulting thread so it
>can look up its tls area through %fs:4 and peek at the top of the signal
>stack for the value.  I will investigate if this is workable.
>But even if the return from signal wrapper frame is wrong, that doesn't
>make the output "less than useful".  Again, for reference, in the
>testcase it was:
>Now, given, it's not perfect but it's not significantly worse than the
>best that addr2line can muster, and infinitely easier for the user to
>generate (i.e. zero effort.)  You can still tell that something called
>printf with a bogus string.

When I said "less than useful", I was referring specifically to the
current situation, not your patch.

Or, in other words, I was conceding your point.

>> The bottom line is that I think that rather than modifying cygwin to
>> work around the limitations of the tools we should be fixing the tools.
>I understand that desire completely.  It's just that I think we can have
>both "free but possibly unreliable" info in the stackdump and "complete
>and correct but requiring a developer environment" from the dedicated
>debug tools.

How is this different from what you'd normally see in linux?  A core
dump from linux isn't particularly useful if you don't have symbols, is it?

>> But, then, that puts the problem back on my shoulders as the gdb and
>> binutils maintainer.
>Don't worry, I have a keen interest in seeing all of this fixed so I
>will try to contribute time.
>Index: coffgen.c
>RCS file: /cvs/src/src/bfd/coffgen.c,v
>retrieving revision 1.65

That's not a patch that I can approve, unfortunately.

