This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: DWARF2 problem with g++-3.0


Jim Blandy <jimb@zwingli.cygnus.com> writes:

> Daniel Berlin <dan@cgsoftware.com> writes:
>> Jim, since your the maintainer, i'm sure you already know this, but
>> because of dwarf2read's inability to handle a lot of dwarf2 properly
>> currently, it doesn't work all that well with g++-3.0.
> 
> I know that there are a bunch of problems with GDB and the GCC V3 C++
> ABI.  I don't understand what they are yet, so this is definitely
> helpful.
> 
> Rewriting the dwarf2 reader is a big change.  I'd need to understand
> the current reader's problems more thoroughly than I do now before I
> could approve a rewrite.  Modulo the sarcasm, thanks for taking the
> time to explain these problems.

Sure. Sorry for the sarcasm. It's been a particularly long week for me
anyway (girlfriend had sudden change of flight plans, had to drive up
to newark on four hours sleep, etc).
I probably shouldn't be reading email.


The list of problems with the current dwarf2 reader, and why i
chose to rewrite it (it's more of a get rid of all the problems
enumerated here rewrite, than a start from scratch rewrite. But it
affects every procedure):

In order, with the later problems being caused by the earlier ones (IE
the fundamental problems).
It was based on the dwarf reader originally, which had a competely
different information representation format. 
It relies heavily on global state to attempt to deal with compilation
units properly.  It deals with this by never deallocating anything,
and storing pointers to the global state 
pointers in the psymtab_private stuff. So if those pointers never got
set properly (IE you tried to use an innocuous looking d2 routine
elsewhere, or even in some other part of the d2 reader), everything
comes crashing down.
Because it relies heavily on global state, and never deallocating
anything, we waste a *ton* of memory, because we can't free any of the
buffers we allocate (IE we can't just free the part of the line info
buffer that we just processed, since all of the buffer was read at
once, and allocated at once, for the entire debug_line section.  We
have to wait till we are done completely with all of the compilation
units to be fully processed, 
which means waiting until the psymtabs are *all* converted).
It can only process one compilation unit at a time, sequentially.
Once you start processing a compilation unit, you can't stop until you
are done, or else you have to restart again (the offsets and whatnot
we are currently at and the abbrevs table/die reference table is
global state too).  This is because of the global state.
This means we can't handle, among other things, DW_FORM_ref_addr
properly.  If it's an absolute offset from .debug_info, and doesn't
happen to be pointing to a die in our compilation unit, we get
screwed.
We can't just read the abbreviations table for that compilation unit,
and do a recursive call to the right procedure, because the
abbrevatations table, like everything else, is global.  If you read in
the abbrevatiations for the other cu, you've just blown away the ones
for the current CU.
DWARF2.1 now supports not only imported compilation units/one CU per
include file (it used to technically be frowned upon, but it's a good way to
do elimination), but compilation units can be in other files, and we
can reference them.
It also has discontiguous scopes, so we need to be able to pick up the
state at the right point in time.
We can't properly give names to scopes, we rely on
DW_AT_mips_linkage_name to save us. This is because, as you would
imagine, of our reliance on all of this global state.
We can't do smart elimination, like STABS does, because we have all
this global state, too.
As Jason Merrill confirmed for me, for at least C and C++, we can
eliminate any symbols with the same name, declared in the same line,
of the same file.
However, we can't do this, because the files and directories table are
local to the line decoding routine, and deleted afterwards (it just
decode all the line info for the CU at once, using a pointer to the
offset into the global buffer contianing all the line info for that
objfile).
Adding features like pubnames support, arange support, frame support,
range list support, etc, is complicated by all the above.
Even reducing the memory requirement (by each CU keeping it's own
portion of the various sections it needs, so they can be properly
freed after they aren't necessary, which is after we get a full
symtab.)
isn't possible without fixing the same basic problems.
To change any of this requires the same basic rewrite, which is what i
performed:

Change the global state to be CU specific state.

IE, rather than passing around abfd's and objfiles, and relying on
global state, we pass around, to the various procedures, like
read_enumeration, etc, a concise structure representing a compilation
unit we expect that routine to operate in the context of, and all of
the state for that specific CU.  This includes, that CU's portion of
the various buffers that were singular and global before. The offsets
in the file of those buffers (so we can reread them if we've thrown
them out).  The abfd that CU came from.  The abbrevs hash table. The
die reference table.  

Basically everything that was global, and was very very bad practice.

I've added routines to read a compilation unit at a given offset (IE
get it's dies and abbrevs, but not *do* anything with them) a hash
table of compilation units, a routine to get the compilation unit
instance for a CU at a given offset if it's in the hash table, and a
routine that gets the cui if it exists in the hashtable, and reads in
the compilation unit, and returns it's cui (compilation unit instance,
the structure used to pass around the state)
in otherwise.  Lastly, for this part of the changes, I added a routine
that will give you the cui for a die at a given absolute offset in a
file. This is necessary to support DW_form_ref_addr properly, since
the DIE it's referencing could be in another CU.
Most of the functional routines that actually do die processing and
whatnot haven't been changed much at all.  They simply don't rely on
the various pieces of global state anymore, they just get it from the
compilation unit.
There were some routines that needed some kind of significant
modification, but these routines are rather small.  They include the
routine to follow die references, which needed to make sure we got the
right cui for the die in question, and used *that* when trying to read
the die to return, rather than assuming it was always in the same
compilation unit.

I added routines to read the frame info if it exists, and a routine to
given a given pc, give you the dwarf frame info associated with that
PC.  I've verified it works, but it's obviously not used in anything
right now (well anything that isn't a local hack to use it to get
frame registers and whatnot, that i put in my tree to test it), and
won't be until we can override the routines to get saved registers and
whatnot with the dwarf ones.
Other routines added:
To read and store the pubnames tables. and the aranges.
To handle debug_loc properly (though it's currently #if 0'd out until
the symbol evaluation stuff is approved, since it's worthless until
then. I verified it works).

To handle .debug_macinfo, but do nothing with the info (until we
hookup cpplib, obviously).

We also free memory chunks we are done with them, and save a lot of
memory, since we only read the info for a CU when we need to process
*that* CU, and when we are done generating symbols for that CU, free
it up.
All the approriate reading routines have been written to handle the case where
the various parts of the info they need may not be present already in
the cui passed in.  IE: They assume nothing, but take what they can
get.

As I said, the changes are mainly global changes that represent, besides the
added routines, require small modifications to each routine.


However, the patch is 50k bigger than the actual file.

So that pretty much constitutes and entire rewrite.

I also fixed some of the bugs i noticed along the way, and most of the
FIXME's.

In short, i've made the routines not require global state that was
making performing modifications, optimizations, and general
maintenance, a pain in the ass. 
We used to pass around all kinds of crap to start to get around these
problems.  Kevin, for instance, needed to start passing the cu_header
to various routines to be able to handle 64 bit dwarf2. (No offense
meant to Kevin here, but it is ignoring the larger problem that caused
this to be necessary in the first place). No longer is this necessary
of course, because we pass the CUI to these routines already, and the
CUI has the header for that CU in it.

It's not something you could do incrementally, because it changes the
basic design decision of using global state, for a debug format that
is specified in terms of local units.

> 
> 
>> Compile misc.cc, from the gdb.c++ testsuite, with dwarf-2 info on,
>> using gcc-3.0.
>> 
>> (%:/buildspace/dwarf2gdbtree/src/gdb/testsuite/gdb.c++)- nonworkinggdb ./a.out
>> GNU gdb 2001-06-12-cvs
>> Copyright 2001 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "powerpc-unknown-linux-gnu"...
>> (gdb) p vB
>> $1 = {void (void **)} 0x100013ac <vB>
>> (gdb) ptype vB
>> type = void (void **)
>> (gdb)
>> 
>> This is completely wrong, it's assigned the name vB to the constructor
>> instead of vB::vB.
>> and ptype vB, which is the name of the class, gives you the type of
>> the constructor instead.
>> Bad.
>> 
>> It should be doing this:
>> 
>> (%:/buildspace/dwarf2gdbtree/src/gdb/testsuite/gdb.c++)- ../../gdb ./a.out
>> GNU gdb 2001-06-12-cvs
>> Copyright 2001 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "powerpc-unknown-linux-gnu"...
>>  
>> (gdb) ptype vB
>> type = class vB : public virtual vA {
>>   public:
>>     int vb;
>>     int vx;
>>  
>>     vB & operator=(vB const&);
>>     vB(int, const void **, class vB &);
>>     vB(int, const void **);
>> }
>> (gdb) p vB
>> Attempt to use type name as an expression
>> 
>> The problem, as i'm sure you know, is that we go to lookup the
>> specification die for the clone constructor (which is an abstract entity at
>> the subprogram level, not a class member. Which is correct), and take
>> an attribute from it. In this case, the attribute is the name.
>> However, this won't work for names, because the name on the
>> specification die, which is a class member, isn't a fully qualified
>> name.  It can't be fully qualified, because it doesn't really
>> exist. It's a specification for the two out-of-class constructors that
>> are the real constructors.
>> So, since we fail to notice that the specification, it's really inside
>> another scope, and thus, we end up naming it "vB", instead of
>> "vB::vB".  The real constructors have no DW_AT_mips_linkage attribute,
>> because DW_AT_mips_linkage is a hack, not necessary, and we're trying
>> to make it go away anyway. 
>> 
>> I'd help out here, and provide a patch that rewrites much of the
>> dwarf2 reader because of this and other problems, but i can't make
>> patches that work, and i'm sure, since you are the dwarf2 maintainer
>> after all, you took care of this problem, and have a patch ready for
>> it.  And i'm sure it won't be a bad hack, like tracking the nesting
>> level of each die in an array, ignoring partial symbols that have a
>> specification in a deeper nesting level, and then getting the real
>> name for the other constructor dies (that have an abstract origin,
>> instead of a specification, but suffer the same fate anyway), from the
>> pubnames table.  Cause that would be a pretty horrific hack, and
>> depend on the behavior of g++ outputting fully qualified pubnames (Not
>> that the dwarf2 reader works all that well with other compilers, due
>> to it's failure to support ref_addr properly, something else i fixed,
>> but am sure you have patches just around the corner for).  Nor would I
>> think you would attempt to perform the hack of adding a prefix
>> argument to the read_partial_die, so we can try to pass along the
>> current scope's prefix, which wouldn't work anyway (you'd need to
>> store it in the partial die info).  Nope. I'm sure you've already
>> solved this problem the right way, no?
>> 
>> I guess I'll submit the rewrite anyway, even though i know it's not
>> necessary, since as I said, i've got faith you've got this problem
>> under control already.
>> 
>> It is, after all, a pretty important bug, since it's nasty in the way
>> it affects people, being that once things get into the symbol table
>> (IE the psymtab gets converted) the lookups work since they find the
>> real symbol first.  Of course, this means you won't really notice it
>> until you have programs consisting of more than one file, or shared
>> libraries, where just running the program doesn't cause the psymtab to
>> be converted because that's where the symbol named "main" is.  
>> 
>> Cause then you just get funny behavior like:
>> (gdb) info func vB
>> All functions matching regular expression "vB":
>>  
>> File misc.cc:
>> void vB(void **);
>> void vB(vB *, void **);
>> (gdb) b vB
>> "vB" is not a function
>> (gdb)
>> Which, as your quote today said, is "Amazingly silly".
>> Notice we haven't actually fixed the problem here, just papered over
>> it and gotten lucky.
>> 
>> Normally, whether vB is a type or not will depend on whether you've
>> looked up symbols in the file it's in or not.  The kind of context
>> dependent bug that you really don't want to have.
>> 
>> Then there's another issue of the dwarf2 reader saying "if
>> (die->has_children && ! die_is_declaration (die))
>> {
>> ...
>> }
>> else
>> {
>>         /* No children */
>> }
>> "
>> which is obviously wrong (it may have children, and be a declaration,
>> as die_is_declaration defines it, because die_is_declaration isn't
>> quite right), and exposed by libstdc++-v3, resulting in stubs we
>> shouldn't have.  The fix, is of course, to remove die_is_declaration
>> entirely there.
>> Non-defining declarations have no children. What would they be? If we
>> have one that does, we should do *something with it*, not just ignore
>> it entirely.
>> This also makes the comment right. You knew all of this of course, so
>> i don't know why i'm telling you.
>> In fact, i'm sure you know all about the rest of the issues, and have
>> them taken care of on the dwarf2 side of things.  So i'm only really
>> submitting the rewrite for giggles.  Not that my code is worth much more.
>> 
>> Since the GCC 3.0 release is in 4 days, I expect you'll submit your
>> code soon.
>> 
>> HTH,
>> Dan
>> 
>> 
>> -- 
>> "I got up one morning and couldn't find my socks, so I called
>> Information.  She said, "Hello, Information."  I said, "I can't
>> find my socks."  She said, "They're behind the couch."  And they
>> were!
>> "-Steven Wright
>> 

-- 
"I can levitate birds.  No one cares.
"-Steven Wright


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]