This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Make locale archive hash function architecture-independent


In addition to the three sources of architecture-dependence of binary
locale files previously noted, I found a fourth source in the course
of preparing cross-localedef changes for glibc: the hash function used
in locale archives accesses bytes of the key as "char", so each byte
is sign-extended or zero-extended to uint32_t depending on whether
char is signed or unsigned.  (This function is used to hash locale
names, where the difference is unlikely to matter, and also, in
localedef only and not in libc at runtime, to hash the md5 hashes of
locale contents; I'm not sure why this two-level hash process is
used.)

This patch eliminates this architecture-dependence by using unsigned
char in this hash function.  (I don't believe this is going to make
any significant optimization difference even if some architecture is
more efficient with signed char; looking up locales is not a common
operation, compared to ELF symbol lookup where both the standard and
GNU hash functions use unsigned char.)

(This might affect the compatibility of old locale-archive files with
new localedef, but I believe binary locales must always match the
glibc version in use anyway.  Incompatibility with new glibc binaries,
as opposed to the localedef program, would only occur if the
locale-archive includes a locale with a non-ASCII name, as libc itself
doesn't do the hashing of md5 hashes of locale contents.  If we want a
NEWS entry for any incompatibility, it should probably be added at the
end of this patch series once we've worked out whether to change the
format on m68k to use 4-byte alignment in cases where the native
alignment of int32_t is currently used.)

Tested x86_64.

2013-09-11  Joseph Myers  <joseph@codesourcery.com>

	* locale/hashval.h (compute_hashval): Interpret bytes of key as
	unsigned char.

diff --git a/locale/hashval.h b/locale/hashval.h
index 737162f..c714ec6 100644
--- a/locale/hashval.h
+++ b/locale/hashval.h
@@ -37,7 +37,7 @@ compute_hashval (const void *key, size_t keylen)
   while (cnt < keylen)
     {
       hval = (hval << 9) | (hval >> (sizeof hval * CHAR_BIT - 9));
-      hval += (hashval_t) *(((char *) key) + cnt++);
+      hval += (hashval_t) *(((unsigned char *) key) + cnt++);
     }
   return hval != 0 ? hval : ~((hashval_t) 0);
 }

-- 
Joseph S. Myers
joseph@codesourcery.com

