This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: suggestion for dictionary representation


On Sun, 22 Sep 2002 23:10:56 -0400, Daniel Jacobowitz <drow@mvista.com> said:
> On Sun, Sep 22, 2002 at 09:44:40PM -0500, Jim Blandy wrote:

>> It seems to me that the `skip list' data structure simultaneously
>> meets a lot of the criteria that we're currently meeting by having
>> multiple representations.

Interesting.

>> Skip lists:
>> - provide (probabalistic) log n access time
>> - are low overhead ("They can easily be configured to require an
>> average of 1 1/3 pointers per element (or even less).")
>> - are easy to build incrementally, and stay "balanced" automatically
>> - are obstack-friendly, since they don't involve realloc (as hash
>> tables do)

Well, the current use of hash tables knows the size of the hash table
when building it.  But of course that wouldn't be the case if we used
hash tables for global-like environments.

>> - are an ordered structure, which would support completion nicely (and,
>> by the way, make the `_Z' test for the C++ V3 ABI faster too)
>> - have a trivial iterator (walking the finest level of links)
>> - are pretty easy to understand
>> 
>> http://www.cs.umd.edu/~pugh points to a paper describing and analyzing
>> them.
>> 
>> Using skip lists, there'd be no need to distinguish `expandable' from
>> non-expandable blocks.

Personally, I like this: I'm not that big a fan of that distinction.
I can imagine situations where this could really help us when doing
namespace support, or support for its analogues in other programming
languages.

>> This one structure would scale to handle both local blocks and the
>> global environment (depending on how we handle lazy symbol reading
>> --- I'd like a more generic and descriptive term than "partial
>> symbol tables").

I'm not all that worried about having multiple implementations
depending on the needs of the situation.  But there's certainly no
need to go out of your way to have multiple implementations.

Also, when thinking about issues related to lazy symbol reading, I
think it's important to keep minimal symbols in mind, too.  Which, of
course, isn't really an example of lazy symbol reading, but it is
another sort of object that we currently are currently looking at in
lookup_symbol.

> Hmm.  Lots of simplicity/cleanliness benefits, but the real question
> as far as I'm concerned is whether the benefit to completion (the _Z
> thing is done as we read in symbols right now, so it's a complete
> non-issue) outweights going from O(1) to O(probabalistic log n) for
> symbol lookup.

> I suspect it would; having faster completion [I can't really see how
> to beat O(n) with the current hash tables, can anyone else?  But I
> think it's slower than O(n) right now; I recall it being
> quadratic...] would be nice.  O(~ log N) ought to be plenty fast,
> right?

Yeah, really.

I'm also curious about how it would affect the speed of reading in
symbols.  Right now, that should be O(n), where n is the number of
global symbols, right?  If we used expandable hash tables, then I
think it would be amortized O(n) and with the constant factor larger.
(But, I think, not larger in a way that would make a difference.)  I'm
curious about how often the "amortized" bit would lead to strange
hiccups, but I don't think that's a big deal.

But for skip lists, wouldn't it be something like O(n log n)?  If so,
that's an issue we have to consider.

With luck, I should be able to get the global symbol code into good
enough shape that can we can easily write the code both ways and then
just replace one call to dict_create_hashed_exandable with a call to
dict_create_skiplist, to test it.  We'll see.

>> The only remaining special case would be function blocks, in which
>> parameter symbols must to appear in the order they appear in the
>> source.  I think it's pretty ugly to abuse the name array this way; it
>> introduces special cases in dumb places.  This kludge could be removed
>> by changing the `function' member of `struct block' to a pointer to
>> something like this:
>> 
>> struct function_info {
>> struct symbol *sym;
>> int num_params;
>> struct symbol **params;
>> };
>> 
>> This would require extra space only for function blocks; non-function
>> blocks would remain the same size.  And this info would only be
>> consulted when we actually wanted to iterate over the parameters.
>> This would clean up a bunch of loops in GDB that currently have to
>> iterate over all the symbols in a function's block and do a switch on
>> each symbol's address class to find the arguments.  (And would this
>> also allow us to remove the arg/other distinction in enum
>> address_class?  Dunno.)
>> 
>> But if we were to remove function blocks as a special case, there
>> would only need to be a single structure for representing levels of
>> the namespace.

> I'm tempted to whack the block special case for function arguments.  It
> may make name lookup a little more complicated but I think it will make
> everything clearer.  We could, of course, try this on the branch and
> see if we like the results :)

Would it be reasonable to break up function blocks into two separate
blocks: a linear block that only defines the parameters for the
function and a non-linear block that contains the actual local
variables?  Not that I think Jim's scheme is a bad one - I agree that
it's better than the current scheme - but given the possibility of
local variables shadowing function parameters, it seems to me to be
conceptually cleaner to have two separate blocks appear anyways, and
it also solves this problem.

Also, for what it's worth, I'm still not ready to completely give up
on representing members of classes via a dictionary; that would
provide another place where a linear dictionary environment could be
useful.

David Carlton
carlton@math.stanford.edu


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]