This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.

Re: linker debug info editing

From: Daniel Berlin <dberlin at dberlin dot org>
To: Jim Blandy <jimb at red-bean dot com>
Cc: binutils at sourceware dot org, gdb-patches at sourceware dot org
Date: Sun, 12 Mar 2006 21:44:27 -0500
Subject: Re: linker debug info editing
References: <20060310124921.GN6777@bubble.grove.modra.org> <8f2776cb0603101744w3dd59741s4ad8e17b7069a6fa@mail.gmail.com>

On Fri, 2006-03-10 at 17:44 -0800, Jim Blandy wrote:
> After you've chosen dies to delete, how do you deal with other dies
> that refer to the deleted dies?  I'm not talking about parents; I'm
> talking about attributes whose form is DW_FORM_ref*.

The only correct answer to this is "rewrite all the references,
starting from scratch" :P

You could track it, but the gap tracking you'd have to do is pretty
annoying.  I had to do this once for a dwarf2 duplicate die eliminator.
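
To make the gap tracking concrete, here is a minimal Python sketch. It is not code from any linker or from the duplicate eliminator mentioned above. It assumes the deleted DIEs are known as sorted, non-overlapping (start, length) byte ranges in .debug_info, and that every surviving reference targets an offset outside those ranges. The names build_gap_table and remap_offset are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def build_gap_table(deleted):
    """deleted: sorted, non-overlapping (start, length) byte ranges."""
    starts = [start for start, _ in deleted]
    ends = [start + length for start, length in deleted]
    # removed[i] = total bytes removed by the first i gaps
    removed = [0] + list(accumulate(length for _, length in deleted))
    return starts, ends, removed


def remap_offset(old_offset, table):
    """Map an offset in the original section to its offset after deletion."""
    starts, ends, removed = table
    # Number of gaps that start at or before old_offset.
    i = bisect_right(starts, old_offset)
    if i > 0 and old_offset < ends[i - 1]:
        raise ValueError("reference points into a deleted DIE")
    return old_offset - removed[i]


table = build_gap_table([(100, 20), (300, 50)])
print(remap_offset(50, table), remap_offset(120, table), remap_offset(400, table))
```

The sketch ignores the annoying part: once offsets shrink, a reference may fit a smaller form, which moves every later offset again.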

SGI's linker eliminated duplicate dies at link time, IIRC.

> .
> 
> I think the information we need to do this reduction correctly isn't
> available at the level you're working at.  linkonce sections aren't
> really deleted; they're unified.  The data in them doesn't go away;
> equivalent data from elsewhere is used instead.
> 
> I tend to think that having the compiler divide the information into
> separate compilation units, as Jim suggests, is the only way to go
> here.  In that scenario, inter-CU references will use symbols to refer
> to their targets; after choosing which instance of the linkonce
> section to keep, you should still have definitions for all the symbols
> the other dies' relocs refer to.
> 
> As Daniel says, the GDB-related reasons for avoiding this solution are
> long gone.
> 

The problem with the inter-CU references and section splitting scheme
(i.e. -feliminate-dwarf2-dups) is that it has a greater constant
overhead than straight elimination, because ref_addr forms have
larger values and because of the larger number of sections.  When you
have 80 meg of debug info, a reference by absolute offset from the
beginning of .debug_info ends up being 4 bytes, while otherwise it could
have been as little as 1 byte for an in-CU reference.

This adds up quite quickly.  For a lot of files, we lost 8-10% or more of
the space savings to this overhead.

In cases where you have < 10 meg of debug info, the splitting scheme
sometimes even lost out to not eliminating duplicates at all (even though
there were, in fact, lots of duplicates).
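
A back-of-the-envelope model of this trade-off, in Python. The only figures taken from the text are the 4-byte absolute reference and the 1-byte in-CU reference. The function name net_saving and the inputs in the usage lines are made up for illustration and are not measurements.

```python
def net_saving(duplicate_bytes_removed, refs_widened,
               ref_addr_size=4, in_cu_ref_size=1):
    """Bytes saved by splitting, after paying for wider references.

    refs_widened is the number of references that change from an
    in-CU form to a ref_addr form because their target moved to
    another unit.  A negative result means the split made things worse.
    """
    widening_cost = refs_widened * (ref_addr_size - in_cu_ref_size)
    return duplicate_bytes_removed - widening_cost


# Hypothetical inputs: the same 1000 bytes of duplicates removed,
# with few and with many widened references.
print(net_saving(1000, 100))
print(net_saving(1000, 400))
```

With few widened references the split still pays off. With many, the widening cost exceeds the duplicates removed, which is the small-file case described above.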

Also, deciding what to put into the split sections is hard.  You can't
just split every type and program into a separate CU, and ref_addr
everything.  The overhead of doing so is enormous.
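
As a rough lower bound on that overhead, here is a Python sketch that assumes 32-bit DWARF 2, where each compilation unit header is 11 bytes (4-byte unit_length, 2-byte version, 4-byte debug_abbrev_offset, 1-byte address_size). The function name split_overhead and the counts in the usage line are hypothetical. The sketch leaves out the DW_TAG_compile_unit DIE, the abbreviations, and the per-section cost in the object file, so real overhead is higher.

```python
# 32-bit DWARF 2 CU header: unit_length (4) + version (2)
# + debug_abbrev_offset (4) + address_size (1).
CU_HEADER_BYTES = 11


def split_overhead(units, cross_unit_refs,
                   ref_addr_size=4, in_cu_ref_size=1):
    """Lower bound on bytes added by splitting DIEs into separate CUs."""
    header_cost = units * CU_HEADER_BYTES
    ref_cost = cross_unit_refs * (ref_addr_size - in_cu_ref_size)
    return header_cost + ref_cost


# Hypothetical: 1000 types each in its own CU, 5000 references to them.
print(split_overhead(1000, 5000))
```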

I spent a large amount of time when we were implementing
-feliminate-dwarf2-dups measuring the cost of various schemes for
deciding what to try to split and what not.

I came to the conclusion that splitting sections should only really be
used if you can't have something that just goes through and eliminates
all duplicates by understanding and rewriting the dwarf2 info all at
once at link time.

Follow-Ups:
- Re: linker debug info editing
  - From: Alan Modra
- Re: linker debug info editing
  - From: Daniel Berlin

References:
- linker debug info editing
  - From: Alan Modra
- Re: linker debug info editing
  - From: Jim Blandy
