hash<string> and third-party libraries

Daniel Kegel dank@kegel.com
Thu May 6 20:15:00 GMT 2004


libstdc++ includes handy extensions like hash_map and hash_set.
Although using them is inherently non-portable,
one might reasonably expect their use to be portable amongst all
people using the same version of libstdc++.   However, if you want to use
C++ strings as hash keys, you have to supply your own definition
of hash<string>, e.g.

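#include <string>
#include <ext/hash_map>   // pulls in __gnu_cxx::hash

namespace __gnu_cxx {
         // Note: hashes only up to the first NUL character in x.
         template<> struct hash< std::string > {
                 size_t operator()( const std::string& x ) const
                 {
                         return hash< const char* >()( x.c_str() );
                 }
         };
}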

because libstdc++'s ext/hash_fun.h doesn't define it.

And therein lies a portability problem.  Looking around on the net,
I see that different programs implement that function differently
(e.g. some use the whole string, some just up to the first null).
If two different pieces of code that get linked into the same
executable use two different definitions of that function, that
violates the one-definition rule and the result is undefined
(if you're lucky, the linker will catch it, but you might also
just have a silent runtime incompatibility).
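
To make the difference concrete, here is a minimal sketch of the two
styles of definition I have seen. The helper names are hypothetical, and
the multiply-by-5 loop is only an illustration of the kind of loop
hash<const char*> uses; the point is just that the two variants disagree
on keys with an embedded null.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: hashes only up to the first NUL, which is what
// a definition built on hash<const char*>()(x.c_str()) ends up doing.
inline std::size_t hash_to_first_nul(const std::string& s) {
    std::size_t h = 0;
    for (const char* p = s.c_str(); *p != '\0'; ++p)
        h = 5 * h + static_cast<unsigned char>(*p);
    return h;
}

// Hypothetical helper: hashes every byte, including embedded NULs.
inline std::size_t hash_whole_string(const std::string& s) {
    std::size_t h = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        h = 5 * h + static_cast<unsigned char>(s[i]);
    return h;
}

// Two distinct keys that share the prefix "ab" before an embedded NUL.
inline std::string key_a() { return std::string("ab\0cd", 5); }
inline std::string key_b() { return std::string("ab\0xy", 5); }

// The two definitions agree on ordinary strings but not on these keys.
inline void demonstrate_mismatch() {
    assert(hash_to_first_nul(key_a()) == hash_to_first_nul(key_b()));
    assert(hash_whole_string(key_a()) != hash_whole_string(key_b()));
}
```

If one translation unit was compiled against the first definition and
another against the second, the same key can land in different buckets
depending on which copy of the function the linker happens to keep.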

Let's say you buy a copy of a third-party shared library
which uses the hash_set extension (either internally or in the interface).
Even if the vendor carefully uses the same version of gcc to compile
the library before he ships it to you, you could end up with a broken program
because of this.

I guess the only safe solution is to avoid using extensions :-)
or failing that, define your own containers when you want to
use a string as a hash key.
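
A lighter-weight variant of "define your own" might be to leave
__gnu_cxx::hash alone and pass your own functor as the hash template
argument instead. The sketch below assumes a hypothetical namespace
myapp and functor string_hash. It uses std::unordered_set purely as a
stand-in so that it compiles without the extension headers; with the
extension, the typedef would name hash_set<std::string, string_hash>.
This only keeps your own code out of the clash; it does nothing about
two third-party libraries that each specialize hash<string> differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

namespace myapp {  // hypothetical namespace owned by the application

// Hash functor defined outside __gnu_cxx, so it cannot collide with
// anyone else's specialization of hash<std::string>.
struct string_hash {
    std::size_t operator()(const std::string& s) const {
        std::size_t h = 0;
        // Uses the whole string, including any embedded NULs.
        for (std::string::size_type i = 0; i < s.size(); ++i)
            h = 5 * h + static_cast<unsigned char>(s[i]);
        return h;
    }
};

// Stand-in container; with the libstdc++ extension this would be
//   __gnu_cxx::hash_set<std::string, string_hash>
typedef std::unordered_set<std::string, string_hash> string_set;

}  // namespace myapp
```

Because the functor is part of the container's type, a mismatch between
two pieces of code shows up as a type error at compile time rather than
as a silent difference at run time.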

Still, it seems like there ought to be a way to solve this problem
in a way that lets everyone use hash_set<string> etc. without
worrying about compatibility.

This came up for me today, and it's making me groan.

Is anyone else annoyed by this?


