This is the mail archive of the
guile@sourceware.cygnus.com
mailing list for the Guile project.
Re: hash?
- To: Greg Harvey <Greg dot Harvey at thezone dot net>
- Subject: Re: hash?
- From: Lars Arvestad <arve at nada dot kth dot se>
- Date: 03 Jan 2000 09:59:03 +0100
- Cc: Miroslav Silovic <silovic at zesoi dot fer dot hr>, hanwen at cs dot uu dot nl, guile at sourceware dot cygnus dot com
- References: <14435.21234.279317.759162@dokkum.cs.uu.nl> <m3ogbggwev.fsf@behemoth.dethfart.org> <7eg0wrt3qe.fsf@zesoi.fer.hr> <yueu2l4zpwp.fsf@turing.nada.kth.se> <m31z84mn7a.fsf@behemoth.dethfart.org>
I said:
>> Regarding the code, isn't it possible to simply cast a block of four
>> characters as a long? I realize that there can be word alignment
>> problems, but given a string pointer, that should not be an
>> issue. This way, four bytes/characters could be processed in one
>> read.
And Greg responded:
GH> Yep, but the ease of getting the characters probably isn't as much the
GH> reason that it only uses a small number of characters than the desire
GH> to keep the total number of other operations down.
Right, but what I am suggesting/asking is about efficiency and not
hash function characteristics. The hash function, scm_strhash, loops
over the string index. Consider the code for strings/symbols longer
than 5:
scm_sizet i = 5;
unsigned long h = 264 % n;
while (i--)
h = ((h << 8) + ((unsigned) (scm_downcase (str[h % len])))) % n;
return h;
Disregarding the downcasing and the semi-random character choices, the
only work is in accessing a character, multiplying by 256 (shifting)
and taking the modulus. I am suggesting that you get essentially the
same hash function without the loop by doing
return *((long *) str);
And how long are symbols usually? Picking five words instead of five
characters should not impose a big cache-miss penalty since the
current code samples from the whole string.
I am not experienced in fine-tuning code, so I am not so sure about
the effects of my suggestions. But I am genuinly curious about what
people in-the-know have to say.
Lars
--
Lars Arvestad Dept. of Numerical Analysis and Computing Science
Royal Institute of Technology (KTH), Stockholm, Sweden