This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Re: ELF linking question related to symbol collisions


On 11/21/2013 10:14 PM, Carlos O'Donell wrote:

No, that's not the way it works. You must manage the global namespace
or collisions will lead to incorrect runtime behaviour.

Oh well, thanks for confirming my suspicions.

So we have several desktop applications that have an ambiguous
reference to json_object_get_type, via the pulseaudio library.

Yes, and that is a serious problem.

It turns out that there are further collisions between json-c and jansson, on the json_object_get and json_object_iter_next symbols, but we don't have any binaries in Fedora that trigger them (based on the static DT_NEEDED information).

The problem is mainly that they fall afoul of the ELF rules for
interposition.

Right. But apart from the occasional LD_PRELOAD and the language interpreter optimization, I don't think interposition sees actual use. (Fedora doesn't use that optimization: you compile your main Python/Perl/etc. interpreter binary with a statically linked non-PIC copy of the actual interpreter implementation, exported dynamically, and your extension modules link against that instead of the interpreter DSO.)

Or more to the point, I'm not sure if the current linking algorithm is what we need. Something like -B direct might be a better fit for our current needs.

The trouble with this is that it's fairly difficult to detect. Static
analysis misses collisions introduced by dlopen and dlsym.

Right, in this case you need a special-purpose analysis tool to
catch this, something that models the dynamic linker and ELF.

I have a pretty good approximation, but exclusively based on data that is statically available (and some heuristics to avoid implementing rpath resolution or /etc/ld.so.conf.d/ handling). The amount of data is fairly large, so I haven't tried to detect actual symbol interposition.
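A minimal sketch of this kind of static approximation, assuming the DT_NEEDED entries and exported symbols have already been extracted into a plain table. The sonames and most symbol names below are hypothetical toy data, not taken from any real package, and the walk ignores rpath, /etc/ld.so.conf.d/, dlopen and dlsym:

```python
from collections import defaultdict, deque

# Hypothetical toy data: object name -> (DT_NEEDED entries, exported symbols).
LIBS = {
    "app": (["libsound.so.0", "libjson-b.so.4"], set()),
    "libsound.so.0": (["libjson-a.so.2"], {"sound_connect"}),
    "libjson-a.so.2": ([], {"json_object_get", "json_a_parse"}),
    "libjson-b.so.4": ([], {"json_object_get", "json_b_loads"}),
}

def needed_closure(root, libs):
    """Breadth-first walk over DT_NEEDED, approximating the lookup order."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        name = queue.popleft()
        order.append(name)
        for dep in libs[name][0]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order

def collisions(root, libs):
    """Return symbols defined by more than one object in the closure."""
    definers = defaultdict(list)
    for name in needed_closure(root, libs):
        for sym in libs[name][1]:
            definers[sym].append(name)
    return {sym: objs for sym, objs in definers.items() if len(objs) > 1}

print(collisions("app", LIBS))
```

For the toy "app", the only symbol with two definers is json_object_get, and the first entry in its list is the definition that would win the lookup in this simplified model. Checked on its own, libsound.so.0 shows no collision, which is why the analysis has to start from each executable rather than from each library.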

Symbol collisions are only bad if both symbols do not implement the
same ABI and API. If they do implement the same ABI and API then it's
a replacement function that is safe to interpose.

Unless that function references static symbols; in that case it may or may not be safe. If everything ends up being interposed, it is okay, but if not, code might hit different static symbols.

Symbol versioning does not solve the problem in general either since
then you need a global version name management, and you need to fix
all applications to use versioning which is a huge amount of work.
Even then you can still have problems if the projects lack the rigour
required to update their version maps.

I had hoped it would be possible to use a static map with a single version-like string, like JSON-C, JSON-GLIB, and JANSSON. Maybe we could even use the soname for this by default.

This is easier than renaming everything under a shared prefix, and would not affect backward compatibility. Perhaps I'm a bit naïve, but I suspect this could be used to implement part of the linking semantics of -B direct and reap some of the performance benefits because it is easier to locate the correct hash table.

A different linking algorithm isn't helpful either because at static
link time you don't need the devel libraries for all dependent libraries,
and requiring it would make compiling anything much more complicated.

Sure, but I think static linking has no future. If it reappears, it will be in the form of LTO object files.

The only robust solution I see is a post-build tool that looks for
global namespace collisions and rejects the build if they exist.

That's difficult to do because anything that needs to consider more than one piece of software in isolation has a high overhead, both performance-wise and administratively.

If we decide to implement namespace management outside of the toolchain, I think we should have a list of symbols and symbol prefixes that map to library (soname, and then OS package) that defines them. If a library (or dynamic executable) uses a symbol outside that list, we'd either have to fix the list or address the unintentional out-of-namespace symbol leakage. Both measures can be taken even before any actual collision materializes, and it's only

The workaround might be to register your allowed symbol interpositions
in the spec file such that the post-build tool can use those to resolve
such allowances. Note that just stating that symbol X may be interposed
is not sufficient to make this system safe; you must say symbol X from
SONAME Y may interpose.

I think it should be outside the SPEC file because of the syntax issues (no good parsers, parsing requires arbitrary code execution etc.).
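A rough sketch of how a post-build check could consume such an out-of-band allowance list, assuming the collisions have already been computed as a mapping from symbol to definers in lookup order. The allow-list format, the function name and all sonames here are hypothetical:

```python
# Hypothetical allow list: (symbol X, interposing SONAME Y) pairs.
# Naming only the symbol would not be enough; the SONAME is part of the key.
ALLOWED = {("malloc", "libfastalloc.so.1")}

def rejected_collisions(definers, allowed):
    """definers maps symbol -> sonames in lookup order (first one wins).

    Returns the collisions not covered by an allowance for the winning soname.
    """
    rejected = {}
    for sym, objs in definers.items():
        if len(objs) > 1 and (sym, objs[0]) not in allowed:
            rejected[sym] = objs
    return rejected

found = {
    "malloc": ["libfastalloc.so.1", "libc.so.6"],
    "json_object_get": ["libjson-b.so.4", "libjson-a.so.2"],
}
print(rejected_collisions(found, ALLOWED))
```

With this toy input the malloc interposition is accepted because the winning definition comes from the registered SONAME, while json_object_get is still reported. The same malloc collision with a different library in front would be rejected.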

What's your next step?

If there's a strong desire to implement -B direct, that's the way to go, but it's a bit out of my area of expertise. If there isn't, I'll look into writing an out-of-band namespace model for the upcoming Fedora Base package set.
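To make the out-of-band namespace model a bit more concrete, here is a small sketch, assuming the model is a list of exact symbols and symbol prefixes that each map to one soname (the mapping from soname to OS package is left out). It reads "uses a symbol outside that list" as "exports a symbol the list does not assign to it", which is an interpretation, and every name in it is a hypothetical placeholder:

```python
# Hypothetical namespace model: exact symbols and prefixes -> owning soname.
EXACT = {"json_object_get_type": "libjson-a.so.2"}
PREFIXES = {"json_a_": "libjson-a.so.2", "json_b_": "libjson-b.so.4"}

def owner(symbol):
    """Return the soname that owns the symbol, or None if it is unassigned."""
    if symbol in EXACT:
        return EXACT[symbol]
    matches = [p for p in PREFIXES if symbol.startswith(p)]
    if not matches:
        return None
    # The longest matching prefix wins, so nested namespaces stay unambiguous.
    return PREFIXES[max(matches, key=len)]

def leaks(soname, exported):
    """Return exported symbols that the model does not assign to this soname."""
    return sorted(s for s in exported if owner(s) != soname)

print(leaks("libjson-b.so.4", {"json_b_loads", "json_object_get"}))
```

Here json_object_get is flagged for libjson-b.so.4 because no entry assigns it to that library, even though no second definition is in sight yet. Each flagged symbol then means either extending the list or fixing the leak.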

--
Florian Weimer / Red Hat Product Security Team

