libstdc++/16612, empty basic_strings can't live in shared memory

Wed Aug 4 11:18:00 GMT 2004

re: 16612 empty basic_strings can't live in shared memory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16612

I am sympathetic to this usage, and would like to accommodate it.  

At the same time, it would be tragic if such accommodation demolished
the performance of the regular string.  A lot rides on that fixed 
address -- particularly in 3.4 and later, which actually checks for
it specifically, and avoids incrementing or decrementing the refcount 
at that address.  Anything that makes _S_empty_rep() slower slows
down the default string constructor, and a lot of user code.
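
As a rough illustration of the fixed-address check described above, 
here is a minimal sketch.  The names Rep, g_empty_rep, grab and 
release are hypothetical and much simplified; the real library uses 
atomic operations and a different layout.  It also shows why the 
scheme breaks in shared memory: an empty string stores the address of 
one process's static object.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for the string's rep header.
struct Rep {
    std::size_t length;
    std::size_t capacity;
    int         refcount;   // number of owners (simplified)
};

// Statically initialized: its contents are fixed before any
// constructor in the program runs.
static Rep g_empty_rep = { 0, 0, 0 };

// Take another reference to a rep.
inline Rep* grab(Rep* r) {
    if (r != &g_empty_rep)   // the fixed-address test
        ++r->refcount;       // a real library would use an atomic add
    return r;
}

// Drop a reference; returns true when the caller should free the rep.
inline bool release(Rep* r) {
    if (r == &g_empty_rep)
        return false;            // never counted, never freed
    return --r->refcount == 0;
}
```

The default constructor then only has to store &g_empty_rep, which is 
why anything that slows down obtaining that address is felt widely.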

Note too that it is very common to use strings during programmed 
"static" initialization, so the empty-string rep has to be initialized 
by the linker, and not at some random point during that process.  
(There is no control over the order in which such initializations run, 
in general, although some targets' linkers support special hacks.)

Now that nothing modifies the empty-string rep object, it would almost 
be possible to use NULL as the address.  The fly in the ointment is 
the recent extension (grrr!) requiring that s.operator[](s.length()) 
actually yield a zero.  (There's no requirement for c_str's result to 
match anything else (it might yield "(charT*)this"!), and data() could 
yield zero + N safely because it can't be dereferenced.)  Of course 
any sort of conditional check in operator[] could be a performance 
disaster, and length() and many other operations would need a 
conditional check too.  Still, it might well be a win, overall.
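
To make the cost concrete, here is a hedged sketch of what the 
NULL-as-empty variant might look like.  TinyString and Rep are 
hypothetical toy types, not the library's; the point is only where 
the extra branches would land.

```cpp
#include <cstddef>

// Hypothetical toy rep: a length plus inline character storage.
struct Rep {
    std::size_t length;
    char        data[16];
};

class TinyString {
    Rep* rep_;   // null means "empty string"
public:
    TinyString() : rep_(0) {}                 // no memory touched at all
    explicit TinyString(Rep* r) : rep_(r) {}

    // Every accessor now needs a test for the null rep.
    std::size_t length() const {
        return rep_ ? rep_->length : 0;
    }

    // s[s.length()] must yield a zero, so the empty case needs
    // somewhere to point; a read-only static char is used here.
    const char& operator[](std::size_t i) const {
        static const char zero = 0;
        if (rep_ == 0)
            return zero;
        return rep_->data[i];
    }
};
```

Nothing position-dependent is stored in the string object itself, so 
an empty TinyString would be valid in any process that maps it.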

What else might we do?  We certainly ought to be able to 
partial-specialize on the default allocator to continue using the
purely static initialization.  But what about user allocators, 
including shared-memory allocators?  Using shared memory doesn't, by 
itself, mean you really want string operations to be slow.  More 
importantly, the shared memory region might not be ready to use at 
program startup time, or when the library is being loaded, so any 
kind of programmed static initialization (init order problems aside) 
is out.

That leaves something akin to Benjamin's approach -- although 
inlining all that probably would be a mistake.  It seems to me the 
comparison to zero need not involve a memory barrier -- once it's 
nonzero, it will never change again.  It appears that each process 
that maps the shared storage would leak its own empty string into it.  
A memory region shared among lots of transient processes would fill 
up quickly.  Copying some other process's empty string will involve 
a real atomic increment, because the address won't match its own -- 
which also means the originating process really can't safely clean up.  
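
The lazy scheme might be sketched roughly as follows.  This is not 
Benjamin's actual patch; lazy_empty_rep and Rep are hypothetical 
names, the sketch assumes C++11 allocator_traits and an allocator 
whose pointer type is a plain pointer, and it ignores the race 
between two threads that both see zero.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical, simplified rep header.
struct Rep {
    std::size_t length;
    std::size_t capacity;
    int         refcount;
};

// Returns this process's empty rep for the given allocator type,
// allocating it from the user's allocator on first use.
template <class Alloc>
Rep* lazy_empty_rep(Alloc& alloc) {
    typedef typename std::allocator_traits<Alloc>::
        template rebind_alloc<Rep> RepAlloc;

    static Rep* cached = 0;   // one per process per allocator type
    if (cached == 0) {        // once nonzero, it never changes again
        RepAlloc rep_alloc(alloc);
        Rep* p = std::allocator_traits<RepAlloc>::allocate(rep_alloc, 1);
        ::new (static_cast<void*>(p)) Rep();   // zero-filled rep
        cached = p;           // never deallocated: this is the leak
    }
    return cached;
}
```

Since cached lives in each process's own static storage, every 
process that maps the region ends up allocating its own copy.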

In principle, we could document that we allocate some distinguished 
type, so that if the user controls the allocator (or can
partial-specialize for that case) and gives all processes the same 
address, then only one instance leaks.  (Of course that address would
have to be stored in the shared-memory region, too, but that would be
the allocator's business.)

A more robust alternative might be to export a block of zeroes from 
libsupc++ specifically for this purpose; it could be shared among 
all string types (or at least among all basic_string<char,T,A> and
basic_string<wchar_t,T,A>).  Too, it should speed up regular string 
operations when libstdc++ is dynamic-linked, because it would avoid 
one stage of relocation (I think).  I assume here that libsupc++'s 
static storage would always appear at a fixed address.  Maybe I'm 
wrong.  (Maybe there's already a block of zeroes there that we can
share!)

That's all I can think of at this hour, but there ought to be other 
practical alternatives, too.  Let's not be hasty.

Nathan Myers
ncm-nospam@cantrip.org