This is the mail archive of the guile@cygnus.com mailing list for the guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: avl-trees vs. hashes


On Wed, 28 Oct 1998, Dirk Herrmann wrote:

> I think the best solution is to _not_ store the hash by default. Reason: The
> user could do so if he wishes. In a generic hashtable implementation, which
> allows to provide a hash-function and an equality predicate, you could do the
> following:
> 
> * Keys are pairs (hash . real-key)
> * The hash-function which is provided to make-hashtable is simply 'car.
> * The equality function first compares the cars, then performs the real
>   equality test on the real-keys:
>   (define (my-hash-equal? k1 k2)
>     (and (eq? (car k1) (car k2))
>          (real-equality-test (cdr k1) (cdr k2))))
> * Instead of (hashtable-set! table key value) you do 
>   (hashtable-set! table (cons (real-hasher key) key) value)
> * make-hashtable is called as follows (or similarly):
>   (make-hashtable car my-hash-equal?)
> 
> By _not_ adding the hash you have the possibility to get the best of all
> worlds.
> 

I think (hashtable-set! table (cons (real-hasher key) key) value) would
scare the hell out of a Scheme new-comer  ;)

I do plan on adding a "(make-hashtabx hasher-function equality-predicate)"
to my extension.  But of course, I'll still be storing the key's hash
value in the entry simply because it's such a big win for ref, set!, and
resize. 

Guile currently has

primitive: hashx-ref hasher assoc table key [default] 
primitive: hashx-set! hasher assoc table key value 
primitive: hashx-remove! hasher assoc table key 

If users really want a non-hash-storing, auto-resizing (or at least
auto-resizable) hash table, then one possibility would be to modify these
procedures so that "table" is a header-vector pair.  The procedures would
then do book keeping with the header, and use the header to decide when to
resize. 

scm_hash_fn_get_handle
scm_hash_fn_create_handle_x
scm_hash_fn_ref
scm_hash_fn_set_x
scm_hash_fn_remove_x

in hashtab.c need to be modified accordingly.  Perhaps the new procedures
could be called resize-hashq-set!, resize-hashv-remove!, etc.

Another possibility would be to rewrite my extension (I would just copy
hashtab-type.c to hashtab-no-store-hash.c and make the necessary
modifications).  Right now, I'm opposed to rewriting my extension to make
hash storing optional for two reasons:

1. it's faster to unwrap hash from

(hash (key . value))

than it is to unwrap hash from

((hash . key) . value)

The efficiency of the hash unwrapper effects all of ref/set!/del!  

2. And of course, in general it's faster to compare hash values than it is
to compare keys.


	Jay
	jglascoe@jay.giss.nasa.gov