This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
RE: DT_GNU_HASH latest patches
- From: "djamel anonymous" <djam8193ah at hotmail dot com>
- To: jakub at redhat dot com, binutils at sources dot redhat dot com
- Cc: drepper at redhat dot com, michael dot meeks at novell dot com, libc-alpha at sources dot redhat dot com
- Date: Thu, 06 Jul 2006 19:53:29 +0000
- Subject: RE: DT_GNU_HASH latest patches
- Bcc:
Hello, i am writing you this time about the first variant.after looking at
the benchmark results i noted that there have been a reduction in the number
of l1 cache misses; a reduction in l1 cache misses means a win of 12 cycles
; the difference between the latency of l2 cache and that of l1 cache
15-3.on the other hand replacing a division by a binary and & is a win of at
least 25 cycles, so it think that avoiding tthe division in the common case
may improve performance.
changing the size of the main hash table is not an ideal solution because it
is difficult to choose a good power of 2 size and there is no gurantee that
a power of 2 size will give short chains.this also will introduce a great
variability in the size nbcukets, they may be oversized which may give a big
waste of space.on the other hand the solution of a filter before the main
hash table that is a power of 2 has several advantages:
1-it is small: on average its size is nsymbols bytes, which has more chance
to fit in the l1 cache than 3*nsymols or 4*nsymbols bytes for tha current
hash table.
2-because it is small oversizing or undersizeing will not have a big impact
on the overall space utilization.
3-the benchmark of take3 shows an increase in l2 cache misses because of the
enlarged size of the hash table; the reduced size of the filter hash table
may do less l2 cache misses.
4-in the take3 hash table there is a waste of space because the symbol index
word inside the hash table is useless. according to the charts of yesterday
there is between 5% and 12% empty buckets inside each hash table.for each
empty bucket the symbol index will have no role and will be wasted. with the
filter hash table every added word will participate in reducing the
probability to go to the second hash table.
when an access is done to the filter hash table and the bit is found to be
set, there will be an access to the normal hash table.there is still a
chance to avoid a third memory access if the bucket is found empty(between
5% and 12 %) so the average number of memeory accesses may not increase too
much compared to take3.
I think that nbucket must not be a power of two, otherwise every access to
the normal hash table will access a non empty bucket and third memory access
to the hash chains will be done.
the following is a function that computes the nearest power of 2 to a given
integer number.
static inline unsigned pow2_round(unsigned int input)
{
unsigned log2=32-__builtin_clz(input);
if ((1<<(log2-2))&(input))
return 1<<log2;
else
return 1<<(log2-1);
};
the corresponding size of the filter hash table would be
8*pow2_round(nsymbols) bits or pow2_round(nsymbols)/4 32 bits integers.
best regards.
_________________________________________________________________
Testez Windows Llive Mail Beta !
http://www.msn.fr/newhotmail/Default.asp?Ath=f