This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.



Re: [rfc] split up symtab.h


On Tue, 8 Oct 2002 13:54:46 -0700, Kevin Buettner <kevinb@redhat.com> said:

> Aside from the build time issue, are there other reasons why
> splitting up symtab.h is desirable?

Honestly, I have a hard time answering this question: it's easy enough
to weigh the practical considerations, but the philosophical side of
things isn't so clear to me.  I guess that, if you pinned me down, I'd
say that I start with a default assumption that an include file should
normally correspond to a single construct, which typically means a
single structure.  But I'm not sure I really believe that: I haven't
thought enough about its implications.

To look at it another way: why should 'struct minimal_symbol' and
'struct linetable' both be in the same header file?  The best answer
that I can come up with is:

* 'struct symtab' has a member that is a pointer to a 'struct
  linetable'.
* 'struct symtab' also stores a bunch of 'struct symbol's.
  (Indirectly: via 'struct blockvector' and then 'struct block'.)
* 'struct minimal_symbol' is kind of like 'struct symbol'.

That, to me, is not a very good reason for both of those structs to be
in the same header file.
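
As a rough illustration of why the pointer member alone need not tie the
two structs together: in C, a forward declaration is enough for a struct
that only stores a pointer.  The sketch below uses simplified,
hypothetical stand-ins (the field lists and the helper
symtab_has_linetable are assumptions for illustration, not GDB's actual
definitions).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real GDB structures.  */

/* What a split-out "symtab" header would need: only a forward
   declaration, because struct symtab merely stores a pointer.  */
struct linetable;

struct symtab
{
  const char *filename;
  struct linetable *linetable;  /* Pointer to an incomplete type is fine.  */
};

/* What a separate "linetable" header would provide: the full
   definition, needed only by code that looks inside a line table.  */
struct linetable_entry
{
  int line;
  unsigned long pc;
};

struct linetable
{
  int nitems;
  struct linetable_entry item[1];
};

/* Code that only tests or passes the pointer along does not need the
   full definition of struct linetable.  */
static int
symtab_has_linetable (const struct symtab *s)
{
  return s->linetable != NULL;
}
```

Only files that actually walk the line table entries would need the
second header in such a layout.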

> Here are several reasons for not splitting it:

> 1) The list of includes for many .c files will (I suspect) grow
>    quite a bit.  If it turns out that you'll be replacing one
>    #include statement with five or six (per source file), I
>    can't really see that making the split was an advantage.

I honestly don't know how many includes each #include "symtab.h" would
turn into, on average.  Five might be right, or it might be just a tad
high.  (It also depends on whether or not the
#include files for minimal_symbol, symbol, and partial_symbol are
allowed to include the one for general_symbol_info.)  I've generated
various correlations between uses of different structures (I knew my
number theory background would come in useful somehow!), but they don't
give me a clear answer to this.  One example is that minimal_symbol,
symtab, and symbol are the most commonly used structures in symtab.h,
but while 111 files refer to at least one of these structures, only 29
refer to all three.  (And it's clear that the conceptual link from
minimal_symbol to symtab passes via symbol: only 4 files refer to
minimal_symbol and symtab but _not_ to symbol, and only 8 files refer
to symbol but to neither minimal_symbol nor symtab.  But
there are tons of files that mention either minimal_symbol or symtab
but not both.)
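
The kind of tally described above could be sketched roughly as follows.
This is not the script actually used to produce those numbers: it is a
crude, hypothetical C version that classifies each file's text by which
of the three struct names it mentions and then counts the combinations.
The names mentions, usage_mask, and tally are assumptions for
illustration, and the matching is deliberately simple (it ignores
comments and unusual whitespace).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum
{
  USES_MSYMBOL = 1 << 0,
  USES_SYMTAB  = 1 << 1,
  USES_SYMBOL  = 1 << 2,
  USES_ALL     = USES_MSYMBOL | USES_SYMTAB | USES_SYMBOL
};

/* Return 1 if TEXT contains NAME not immediately followed by an
   identifier character, so that "struct symtab" does not match
   "struct symtab_and_line".  */
static int
mentions (const char *text, const char *name)
{
  size_t len = strlen (name);
  const char *p = text;

  while ((p = strstr (p, name)) != NULL)
    {
      unsigned char next = (unsigned char) p[len];

      if (!isalnum (next) && next != '_')
        return 1;
      p += len;
    }
  return 0;
}

/* Classify one file's contents as a bitmask of the structs it uses.  */
static unsigned
usage_mask (const char *text)
{
  unsigned mask = 0;

  if (mentions (text, "struct minimal_symbol"))
    mask |= USES_MSYMBOL;
  if (mentions (text, "struct symtab"))
    mask |= USES_SYMTAB;
  if (mentions (text, "struct symbol"))
    mask |= USES_SYMBOL;
  return mask;
}

struct usage_counts
{
  int any;               /* At least one of the three.  */
  int all;               /* All three.  */
  int msym_symtab_only;  /* minimal_symbol and symtab, but not symbol.  */
  int symbol_only;       /* symbol, but neither of the other two.  */
};

/* Count the combinations over N per-file masks.  */
static struct usage_counts
tally (const unsigned *masks, size_t n)
{
  struct usage_counts c = { 0, 0, 0, 0 };
  size_t i;

  for (i = 0; i < n; i++)
    {
      if (masks[i] != 0)
        c.any++;
      if (masks[i] == USES_ALL)
        c.all++;
      if (masks[i] == (USES_MSYMBOL | USES_SYMTAB))
        c.msym_symtab_only++;
      if (masks[i] == USES_SYMBOL)
        c.symbol_only++;
    }
  return c;
}
```

Feeding usage_mask the contents of each source file and passing the
resulting masks to tally would give counts of the same shape as the ones
quoted above.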

I'm certainly not looking forward to changing existing files.  Having
said that, I don't think it would impose a large future maintenance
burden: if somebody, say, adds a function to an existing file that
requires one of the new headers to be pulled in, the compiler
will let that person know, and it's easy enough to use grep to figure
out which file to include.

> 2) One could argue that modifying symtab.h *should* be a heavyweight
>    operation.  I.e., you're modifying something that's at the very
>    heart of gdb and you need to take great care.

This is, to me, a really important issue.  But I'm honestly not sure
whether this argues for or against breaking up symtab.h.  For example,
since symbol stuff is so important, I like to make small changes and
recompile GDB after each change (and even to run a subset of the test
suite after each change) to help reassure me that I didn't screw
anything up.  And having long compilation times really works against
me there: if it takes longer to recompile the program than to make the
change, then I'll wait until I've made several changes before
recompiling.

Or, to give another example, sometimes I make a change, and then after
using it for a little while, I realize that the change is a little
subtler than I first thought.  So I want to include a comment that
explains the situation a little better: but, just by including a
comment, I've doomed myself to a large recompilation.  (I certainly
wouldn't fix a typo in a comment in symtab.h, even if it's one that
I've made myself, unless I'm doing other changes to symtab.h: it's
just not worth the pain.)  Of course, one answer is to do the thinking
before making the change in the first place, and that's the best
situation: but, alas, I'm not a good enough programmer to always be
able to foresee the implications of my changes that way.  (Obviously I
should give changes time to settle before submitting them as an RFA,
but that's another matter entirely.)

So, basically, the long compilation times are making it more painful
to follow what seem to me like good software engineering practices.

> 3) Makefile.in maintenance becomes harder due to the larger
>    number of header files.

I guess; I don't have a lot of experience with that.  I suppose the
problem there is that, if Makefile.in gets out of sync, then you might
not recompile when you're supposed to, and, unlike your first point,
this could lead to problems that programmers were unaware of.  That
would be very bad.  Too bad there's no way to generate those
dependencies automatically...

> I should note that I don't find any of the above reasons to be
> overly compelling.  I just think that we need a better reason for
> making such a split than the build time consideration.

Yeah, I totally agree.  I think that there should be some sort of
philosophical justification for when to split a header file, and I
certainly wouldn't argue with you when you say that changes to core
structures in symtab.h shouldn't be made lightly.

David Carlton
carlton@math.stanford.edu

