Re: avl-trees vs. hashes


> Sorry, both wrong.  The hash table operations appear to be O(1)
> as long as the number of entries (n) is less than the size
> you initially allocated for the table.

And also only if the number of entries per bucket stays O(1).

It's sort of inherent in the concept of hashing that if the
number of elements per bucket gets beyond "a few", then you
lose the benefits of hashing, and specifically the O(1)
property.

I thought that was obvious and implicit in the whole discussion.
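To make the per-bucket cost concrete, here is a minimal sketch in C
(not Guile's actual hash code; the names are made up) of lookup in a
chained hash table.  The expected cost is O(1 + n/m), where n is the
number of entries and m the number of buckets; only by keeping the
load factor n/m bounded by a constant does lookup stay O(1):

    #include <stddef.h>
    #include <string.h>

    struct entry {
        const char *key;
        void *value;
        struct entry *next;     /* chain of entries sharing a bucket */
    };

    struct table {
        struct entry **buckets;
        size_t nbuckets;        /* m */
        size_t nentries;        /* n; keep n/m bounded for O(1) */
    };

    /* Simple string hash (djb2); any reasonable hash will do. */
    static size_t hash (const char *s)
    {
        size_t h = 5381;
        while (*s)
            h = h * 33 + (unsigned char) *s++;
        return h;
    }

    /* O(1) to pick the bucket, plus O(chain length) to walk it.
       Once chains grow beyond "a few" entries, the O(1) is gone. */
    void *lookup (struct table *t, const char *key)
    {
        struct entry *e = t->buckets[hash (key) % t->nbuckets];
        for (; e != NULL; e = e->next)
            if (strcmp (e->key, key) == 0)
                return e->value;
        return NULL;
    }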

> (A) use some kind of balanced tree for the buckets resulting
> in O(log n) behaviour.  (Or worse if you use a linear search.)

Which of course is utterly silly.  What possible advantage could
that have over just using a balanced tree without hashing?
(I'm not implying you are proposing doing this.  However,
some people were ...)

> (B) Make a bigger table and re-hash.  If you double the size of
> the table on each re-hash, then to add O(2^k) entries you
> must touch the first entry O(k) times, again giving O(log n)
> amortised cost.

Bzzt.  The amortized (i.e. average) cost is O(1).  Suppose you
start with 1 element.  At the first doubling you copy 1 element,
the next time 2 elements, the next 4 elements, and so on.
So inserting 2^k elements requires time proportional to
(1 + 2 + 4 + ... + 2^k), which is < 2^(k+1).  The time per element
(amortized) is 2^(k+1)/2^k == 2, which is O(1).
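You can check this with a toy program (hypothetical; it counts only
the copies a doubling table performs, ignoring the cost of hashing
itself):

    #include <stdio.h>

    /* Insert n elements into a table that doubles when full,
       starting from capacity 1, and count the element copies. */
    int main (void)
    {
        unsigned long n = 1UL << 20;    /* 2^20 insertions */
        unsigned long capacity = 1, size = 0, copies = 0;
        unsigned long i;

        for (i = 0; i < n; i++)
          {
            if (size == capacity)       /* full: double and re-hash */
              {
                copies += size;         /* each element copied once */
                capacity *= 2;
              }
            size++;
          }
        printf ("%lu insertions, %lu copies (%.2f per insertion)\n",
                n, copies, (double) copies / n);
        return 0;
    }

The total comes out just under n copies, i.e. less than one copy per
insertion on average, which is exactly the O(1) amortized bound above.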

> The hash table can help with the constants hidden in the big-O
> notation, but it does not improve asymptotic complexity.

Wrong, at least under the normal assumption that memory accesses
and arithmetic are O(1).

	--Per Bothner
Cygnus Solutions     bothner@cygnus.com     http://www.cygnus.com/~bothner