This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: hash?



I said:
>> Regarding the code, isn't it possible to simply cast a block of four
>> characters as a long? I realize that there can be word alignment
>> problems, but given a string pointer, that should not be an
>> issue. This way, four bytes/characters could be processed in one
>> read. 

And Greg responded:
GH> Yep, but the ease of getting the characters probably isn't as much the
GH> reason that it only uses a small number of characters than the desire
GH> to keep the total number of other operations down.

Right, but what I am suggesting/asking is about efficiency and not
hash function characteristics. The hash function, scm_strhash, loops
over the string index. Consider the code for strings/symbols longer
than 5:

      scm_sizet i = 5;
      unsigned long h = 264 % n;
      while (i--)
	h = ((h << 8) + ((unsigned) (scm_downcase (str[h % len])))) % n;
      return h;

Disregarding the downcasing and the semi-random character choices, the 
only work is in accessing a character, multiplying by 256 (shifting)
and taking the modulus. I am suggesting that you get essentially the
same hash function without the loop by doing

	return *((long *) str);

And how long are symbols usually? Picking five words instead of five
characters should not impose a big cache-miss penalty since the
current code samples from the whole string. 

I am not experienced in fine-tuning code, so I am not so sure about
the effects of my suggestions. But I am genuinly curious about what
people in-the-know have to say. 

	Lars

-- 
Lars Arvestad               Dept. of Numerical Analysis and Computing Science
                       Royal Institute of Technology (KTH), Stockholm, Sweden

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]