This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: [osol-discuss] shared library symbols at address 0x00000000
- From: Rod Evans <Rod dot Evans at sun dot com>
- To: Martin Man <Martin dot Man at sun dot com>
- Cc: gnusol-devel at gnusolaris dot org, binutils at sourceware dot org
- Date: Tue, 21 Nov 2006 10:17:46 -0800
- Subject: Re: [osol-discuss] shared library symbols at address 0x00000000
- References: <455DE0C7.3040303@sun.com> <455DEFCC.2070206@sun.com> <45617F57.9060802@sun.com> <4561FE8D.1090802@sun.com> <45632BFF.2030307@sun.com>
- Reply-to: Rod dot Evans at sun dot com
Martin Man wrote:
Hi Rod,
[ CCing binutils list because this explanation might be useful for
developers there to see where GNU ld stands regarding the filter symbols ]
Rod Evans wrote:
Some more information - perhaps examples would be better.
Here is a traditional dlopen() example. Note, the caller
establishes a .plt, and a reference to libdl. libdl is an
object filter that causes redirection to ld.so.1 at
runtime.
ok,
oxpoly 406. cat > xxx.c
#include <dlfcn.h>
void main()
{
dlopen("foo.so.1", RTLD_LOCAL);
}
oxpoly 407. elfdump -N.dynsym -s /lib/libdl.so.1 | fgrep dlopen
[1] 0x00000000 0x00000000 FUNC GLOB D 9 ABS dlopen
oxpoly 408. elfdump -d /lib/libdl.so.1 | fgrep FILTER
[1] FILTER 0xe6 /usr/lib/ld.so.1
oxpoly 409. cc -o xxx1 xxx.c -ldl
oxpoly 411. elfdump -N.dynsym -s xxx1 | fgrep dlopen
[3] 0x00020cd8 0x00000000 FUNC GLOB D 0 UNDEF dlopen
oxpoly 412. elfdump -r xxx1 | fgrep dlopen
R_SPARC_JMP_SLOT 0x20cd8 0 .rela.plt dlopen
Note, the caller has no ABS symbol, only the destination does. But in
effect the ABS is irrelevant: the caller doesn't bind to the destination
at runtime, it binds to ld.so.1 through the filtering mechanism.
I don't understand what the filtering mechanism is used for here, if I
explicitly stated that xxx1 will be linked against -ldl.
In the Solaris implementation, dlopen() is an interface that exists in
the runtime linker, ld.so.1. This is the core code that is used to
load any dynamic object. Effectively, when you load the dependency libc,
it is dlopen'ed for you.
The runtime linker is part of your process, but there is no model to
link against it directly. To provide access to selective functions within
the runtime linker, we invented libdl.so.1. libdl has no content, there
is no implementation backing the symbols that it defines - that's one
reason the symbols are ABS :-). At runtime, libdl acts as a filter,
causing the redirection of any binding to the interfaces offered by
ld.so.1 itself. The intent is to be selective - there are many other
routines in ld.so.1, but we don't want users calling them.
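The redirection described above can be modelled in a few lines of C. This
is only a sketch under assumptions, not the data structures ld.so.1 really
uses: struct object, bind_via() and the routine name "rtld_private" are
hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an object filter; not ld.so.1's real structures. */
struct object {
    const char *name;
    const char *const *symbols;   /* NULL-terminated list of defined names */
    const struct object *filtee;  /* non-NULL when the object is a filter */
};

/* Return 1 if obj lists sym among its definitions, else 0. */
static int defines(const struct object *obj, const char *sym)
{
    for (const char *const *s = obj->symbols; *s != NULL; s++)
        if (strcmp(*s, sym) == 0)
            return 1;
    return 0;
}

/*
 * Bind sym through obj. Returns the name of the object that supplies the
 * code, or NULL if obj does not offer sym (or its filtee lacks it).
 */
const char *bind_via(const struct object *obj, const char *sym)
{
    assert(obj != NULL && sym != NULL);
    if (!defines(obj, sym))
        return NULL;               /* not an interface this object exports */
    if (obj->filtee == NULL)
        return obj->name;          /* ordinary object: the code lives here */
    return defines(obj->filtee, sym) ? obj->filtee->name : NULL;
}

/* ld.so.1 has more routines than libdl exposes; "rtld_private" is made up. */
static const char *const rtld_syms[] =
    { "dlopen", "dlsym", "dlclose", "rtld_private", NULL };
static const char *const libdl_syms[] =
    { "dlopen", "dlsym", "dlclose", NULL };

const struct object rtld  = { "/usr/lib/ld.so.1", rtld_syms, NULL };
const struct object libdl = { "/lib/libdl.so.1", libdl_syms, &rtld };
```

In this model bind_via(&libdl, "dlopen") yields the ld.so.1 path, while
bind_via(&libdl, "rtld_private") yields NULL, which is the selectivity the
filter is meant to provide.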
With per-symbol filters, we have a similar scenario:
oxpoly 413. elfdump -N.dynsym -s /lib/libc.so.1 | fgrep dlopen
[2328] 0x00000000 0x00000000 FUNC GLOB D 5 ABS dlopen
oxpoly 414. elfdump -d /lib/libc.so.1 | fgrep FILTER
[1] SUNW_FILTER 0xb35c /usr/lib/ld.so.1
oxpoly 415. elfdump -y /lib/libc.so.1 | fgrep dlopen
[2328] F [1] /usr/lib/ld.so.1 dlopen
oxpoly 417. elfdump -N.dynsym -s xxx2 | fgrep dlopen
[3] 0x00020ca0 0x00000000 FUNC GLOB D 0 UNDEF dlopen
oxpoly 418. elfdump -r xxx2 | fgrep dlopen
R_SPARC_JMP_SLOT 0x20ca0 0 .rela.plt dlopen
Again, the caller makes the same reference. However, this time the
implementation also acts as a filter, but only for those symbols
explicitly tagged to be filters.
I can't find in the example above how xxx2 was compiled/linked. I
suppose you omitted -ldl, in which case the dlopen defined by libc was
used to filter calls to dlopen through /usr/lib/ld.so.1.
Oops, sorry. Yes, the compile line was:
cc -o xxx2 xxx.c
libdl.so.1 still exists, and acts as it always has, as an object filter.
But, folks observed that it would be simpler to bind to libc for the
dl* interfaces (other OSes provide this too). Plus, why waste time
loading a different library just to get the filtering, when every
application loads libc anyway? So, the dl* interfaces were added to
libc as symbol filters.
So, if I read your bug reports correctly, you might have a couple of
issues:
i. the linker is propagating the ABS index to the caller, and in
so doing invalidating the caller's reference, and/or
yes, this is my probable cause with GNU ld
ii. you have an environment where your runtime linker does not
understand the SUNW_FILTER/SYMINFO_FLG_FILTER implementation,
and thus binds the call to the ABS symbol expecting some
implementation code to back it.
yes, I suppose this is the root cause of (i), and I'm trying to seek
some additional information.
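To make issue (i) concrete, here is a minimal sketch of the rule the
elfdump output above implies: a reference satisfied by a shared object
stays UNDEF in the caller, whatever section index the definition carries.
It is not GNU ld or Solaris ld code. struct sym_def and caller_shndx()
are hypothetical names, and IDX_UNDEF/IDX_ABS simply mirror the ELF
values of SHN_UNDEF and SHN_ABS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror the ELF SHN_UNDEF and SHN_ABS values under private names. */
enum { IDX_UNDEF = 0, IDX_ABS = 0xfff1 };

/* Hypothetical summary of the definition the link-editor found. */
struct sym_def {
    uint16_t shndx;        /* section index of the definition */
    int from_shared_obj;   /* 1 if the definition lives in a shared object */
};

/*
 * Section index to record in the caller's .dynsym for a reference that
 * def satisfies. A definition in a shared object is only a promise for
 * the runtime linker, so the caller's entry stays undefined.
 */
uint16_t caller_shndx(const struct sym_def *def)
{
    assert(def != NULL);
    if (def->from_shared_obj)
        return IDX_UNDEF;   /* leave the binding to the runtime linker */
    return def->shndx;      /* definition is linked into the output itself */
}
```

With the libc dlopen entry modelled as { IDX_ABS, 1 }, caller_shndx()
returns IDX_UNDEF, matching the UNDEF entries shown for xxx1 and xxx2.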
As a side note, could you elaborate on where per-library and per-symbol
filtering is useful, what it is used for, and what its drawbacks are?
I have read the Solaris Linker and Libraries Guide and seem to have
identified that it is used to
"abstract compilation environment from the run-time environment"
which sounds nice, but so far does not make much sense to me. Is it
something related to binary compatibility?
Way back in Solaris 2.0, the object filtering mechanism was invented to
satisfy the libdl.so.1 model I've already explained. However, the
concept found other uses. As these uses evolved and expanded, the
thought of providing more precision with less overhead evolved into
per-symbol filters. Here are some examples, but there are many more.
The standards folks created a library, libxnet, which defined a number
of interfaces that Solaris already had implemented in different libraries.
Instead of duplicating the interfaces, we used a filter. This filter
defines the interfaces (therefore making the compilation/linking
environment happy), but redirects the binding at runtime to the one
true implementation. libxnet used to define an object filter:
% elfdump -d /usr/lib/libxnet.so.1
...
[1] FILTER 0x4b6 libsocket.so.1:libnsl.so.1:libc.so.1
For example, one interface offered by libxnet is gethostname(), which we
have implemented in libc. The runtime binding of this interface would
first look for it in libsocket, then libnsl, and then libc. A bit
wasteful, but it worked. We now employ per-symbol filters, so that
a binding to libxnet is redirected precisely to the correct dependency:
% elfdump -y /lib/libxnet.so.1 | fgrep libc.so.1
[55] F [2] libc.so.1 gethostname
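The difference in search cost can be sketched as follows. This is a toy
model under assumptions, not the runtime linker's lookup code: struct lib,
the two probe-counting functions and the short symbol lists are all
illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

/* Hypothetical stand-in for a shared object's exported names. */
struct lib {
    const char *name;
    const char *syms[MAX_SYMS];   /* unused slots are NULL */
};

/* Return 1 if l exports sym, else 0. */
static int lib_defines(const struct lib *l, const char *sym)
{
    assert(l != NULL && sym != NULL);
    for (int i = 0; i < MAX_SYMS && l->syms[i] != NULL; i++)
        if (strcmp(l->syms[i], sym) == 0)
            return 1;
    return 0;
}

/*
 * Object filter: probe each filtee in order. Returns how many objects
 * were searched before sym was found, or 0 if no filtee defines it.
 */
int object_filter_probes(const struct lib *const *filtees, size_t n,
                         const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (lib_defines(filtees[i], sym))
            return (int)(i + 1);
    return 0;
}

/* Per-symbol filter: the symbol names its one filtee, so one search. */
int symbol_filter_probes(const struct lib *filtee, const char *sym)
{
    return lib_defines(filtee, sym) ? 1 : 0;
}

/* Illustrative contents only; the real libraries export far more. */
const struct lib lib_socket = { "libsocket.so.1", { "socket", "connect", NULL } };
const struct lib lib_nsl    = { "libnsl.so.1",    { "gethostbyname", NULL } };
const struct lib lib_c      = { "libc.so.1",      { "gethostname", NULL } };

const struct lib *const xnet_filtees[] = { &lib_socket, &lib_nsl, &lib_c };
```

In this model gethostname costs three probes through the object filter
list and one probe through the per-symbol filter.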
Another technique uses auxiliary filters. These are filters that redirect
to an alternative (just as normal filters do), but should no alternative
exist, the binding falls back to the filter itself. These filter
symbols aren't ABS, as they actually have backing code. The SPARC guys,
and benchmarkers, love these symbols, as they can use them to point to
faster implementations of generic functions. For example, the memcpy()
family.
A user binds to memcpy() in libc, but this library was set up as an
auxiliary filter on our SPARC platforms:
% elfdump -d /usr/lib/libc.so.1
[4] AUXILIARY 0x5b8b /usr/platform/$PLATFORM/lib/libc_psr.so.1
The intent was that if memcpy() was called, and a platform specific (psr)
library was found, the memcpy() would be bound to that "optimized" version.
If the psr library didn't exist, or the symbol didn't exist in the psr
library, the binding would fall back to the generic implementation in libc.
The problem with this object filter is that *every* symbol that would be
bound to libc was first searched for in the psr library. Turns out this
wasn't a huge overhead, but loading the psr library to initiate the search,
when the object never made any mem* calls, could be felt.
So, now we implement this same model with per-symbol filters:
% elfdump -y /lib/libc.so.1 | fgrep memcpy
[93] A [2] /platform/$PLATFORM/lib/libc_psr.so.1 memcpy
Only if a reference to memcpy is made will the psr library be searched
for and loaded. More precise, less overhead.
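A small model of the per-symbol auxiliary filter, showing both the
fallback and the load-on-first-reference behaviour. This is a sketch under
assumptions rather than ld.so.1's implementation: struct aux_filtee,
struct filter_sym, bind_aux() and the symbol lists are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an auxiliary filtee such as libc_psr. */
struct aux_filtee {
    const char *path;
    int present;                  /* 1 if the library exists on this platform */
    int loaded;                   /* set when a lookup first needs the library */
    const char *const *symbols;   /* NULL-terminated list of definitions */
};

/* One symbol of the filter; aux is NULL for an unfiltered symbol. */
struct filter_sym {
    const char *name;
    struct aux_filtee *aux;
};

/* Load the filtee on demand and report whether it defines sym. */
static int aux_provides(struct aux_filtee *f, const char *sym)
{
    if (!f->present)
        return 0;                 /* no such library on this platform */
    f->loaded = 1;                /* first use triggers the load */
    for (const char *const *s = f->symbols; *s != NULL; s++)
        if (strcmp(*s, sym) == 0)
            return 1;
    return 0;
}

/*
 * Returns the path of the object that ref binds to: the filtee when it
 * can satisfy the reference, otherwise the filter itself. Returns NULL
 * if the filter does not define ref at all.
 */
const char *bind_aux(const char *filter_path, const struct filter_sym *syms,
                     size_t n, const char *ref)
{
    assert(filter_path != NULL && syms != NULL && ref != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(syms[i].name, ref) != 0)
            continue;
        if (syms[i].aux != NULL && aux_provides(syms[i].aux, ref))
            return syms[i].aux->path;
        return filter_path;       /* fall back to the generic code */
    }
    return NULL;
}

/* Illustrative data only. */
static const char *const psr_syms[] = { "memcpy", "memset", NULL };
struct aux_filtee psr =
    { "/platform/$PLATFORM/lib/libc_psr.so.1", 1, 0, psr_syms };
const struct filter_sym libc_syms[] =
    { { "memcpy", &psr }, { "strlen", NULL } };
```

Binding strlen leaves psr.loaded at 0; only a reference to memcpy touches
the psr library, and clearing psr.present makes the same reference fall
back to the libc path.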
We're now using per-symbol filters to remove duplicate clutter. There's
an observation that an interface should typically live in one place.
But, like all large software projects, we've found cut-and-paste has
resulted in the same interfaces being offered from more than one library.
We could simply remove one implementation. However, there's a small
chance that a user directly bound to the implementation we want to remove, or
dlopen'd and dlsym'ed for the symbol. They might get lucky and "fall through"
to the one implementation. To remove any doubt, we've used per-symbol filters
to maintain an existing library's interface, but redirect the runtime
binding to the one single implementation.
For example, we had some math routines defined in libc, but the math
library that undergoes most development is libm. It was troublesome
having to make sure the libc versions were up-to-date. Now, we don't
have to. Per-symbol filters preserve the existing interface while
pointing at the one single implementation:
% elfdump -y /lib/libc.so.1 | fgrep libm.so.2
[6] F [0] libm.so.2 scalb
[40] F [0] libm.so.2 _isnan
[383] F [0] libm.so.2 _logb
....
Hope the explanation helps. Plus there's a blog that states some of
the same material:
http://blogs.sun.com/rie/entry/shared_object_filters
--
Rod.