This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: Prelinking of shared libraries
- To: aj at suse dot de
- Subject: Re: Prelinking of shared libraries
- From: "Martin v. Loewis" <martin at loewis dot home dot cs dot tu-berlin dot de>
- Date: Fri, 4 May 2001 23:23:25 +0200
- CC: libc-alpha at sources dot redhat dot com, bastian at kde dot org
- References: <hoae4tvafc.fsf@gee.suse.de>
> So what relocations can benefit from this? Let's look at i386 and
> take libkdecore.so as an example, other architectures should be
> similar.
How did you get these data? Looking at libkdecore.so.2.0.0, as
distributed with SuSE 7.0 (klibs-1.1.2-160), I get the following
relocation sections:
  [Nr] Name              Type Addr     Off    Size   ES Flg Lk Inf Al
  [ 6] .rel.data         REL  00013898 013898 001a50 08   A  2  10  4
  [ 7] .rel.eh_frame     REL  000152e8 0152e8 002570 08   A  2  11  4
  [ 8] .rel.gcc_except_t REL  00017858 017858 00b2b0 08   A  2  12  4
  [ 9] .rel.got          REL  00022b08 022b08 0005c8 08   A  2  15  4
  [10] .rel.plt          REL  000230d0 0230d0 0016a0 08   A  2   c  4
Classifying the individual relocations by type, I get:
R_386_RELATIVE: 7219
R_386_32: 539
R_386_GLOB_DAT: 185
R_386_JUMP_SLOT: 724
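For readers who want to reproduce this kind of tally, here is a minimal
sketch in Python. It assumes the raw bytes of an i386 REL section have
already been extracted (how they are extracted is left open), and that
each entry is a little-endian Elf32_Rel pair (r_offset, r_info) whose
low 8 bits of r_info hold the relocation type. The function name
classify_rel and the sample data are made up for illustration; this is
not necessarily how the numbers above were produced.

```python
import struct
from collections import Counter

# i386 relocation type numbers as defined by the ELF psABI for i386.
R_386_NAMES = {
    1: "R_386_32",
    6: "R_386_GLOB_DAT",
    7: "R_386_JUMP_SLOT",
    8: "R_386_RELATIVE",
}

def classify_rel(section_bytes):
    """Count relocation types in the raw bytes of an Elf32 REL section.

    Hypothetical helper: each entry is 8 bytes, r_offset followed by
    r_info, and ELF32_R_TYPE(r_info) is r_info & 0xff.
    """
    counts = Counter()
    for _r_offset, r_info in struct.iter_unpack("<II", section_bytes):
        rtype = r_info & 0xFF
        counts[R_386_NAMES.get(rtype, "type %d" % rtype)] += 1
    return counts

# Synthetic sample: two RELATIVE entries and one JUMP_SLOT entry that
# refers to symbol index 5 (symbol index lives in r_info >> 8).
sample = b"".join([
    struct.pack("<II", 0x1000, 8),
    struct.pack("<II", 0x1004, 8),
    struct.pack("<II", 0x2000, (5 << 8) | 7),
])

for name, count in sorted(classify_rel(sample).items()):
    print("%s: %d" % (name, count))
```

Running the same function over each .rel.* section of a real library
and summing the counters would give a per-type breakdown like the one
above.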
> So out of 4078 relocations we could get rid of 793 - and only of the
> cheapest relocations.
In summary, I find 8667 relocations in total, and prelinking would
eliminate the 7219 R_386_RELATIVE ones.
That raises two questions: Why do we get different numbers for the
same shared library? And why do I have so many R_386_RELATIVE
relocations when the code should be PIC?
It turns out that most of those relocations are in .eh_frame and
.gcc_except_table, so this looks more like a GCC question: Are those
tables not position-independent? If so, C++ programs would suffer
more than C programs.
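The per-section share can be checked from the section listing alone,
since every REL entry is 8 bytes (the ES column). The following Python
sketch does that arithmetic; the sizes are copied from the listing
above, and the variable names are only illustrative.

```python
# Entry size of an Elf32 REL relocation, as shown in the ES column.
REL_ENTRY_SIZE = 0x08

# Section sizes in bytes, copied from the Size column of the listing.
rel_section_sizes = {
    ".rel.data": 0x001A50,
    ".rel.eh_frame": 0x002570,
    ".rel.gcc_except_t": 0x00B2B0,
    ".rel.got": 0x0005C8,
    ".rel.plt": 0x0016A0,
}

# Number of relocation entries per section.
entries = {name: size // REL_ENTRY_SIZE
           for name, size in rel_section_sizes.items()}

total = sum(entries.values())
eh_related = entries[".rel.eh_frame"] + entries[".rel.gcc_except_t"]

for name, count in entries.items():
    print("%-18s %5d" % (name, count))
print("total              %5d" % total)
print("exception tables   %5d (%.1f%%)"
      % (eh_related, 100.0 * eh_related / total))
```

With these sizes the total comes to 8667 entries, of which 6916 sit in
.rel.eh_frame and .rel.gcc_except_t, roughly 80% of all relocations in
the library.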
Furthermore, I don't understand how pre-linking would proceed. It
appears that all you have to do is define a VMA and relocate a
certain section to that VMA (updating the section's sh_addr and the
relocation entries appropriately). Then the dynamic linker should
attempt to map that section to this VMA; if that succeeds, no
relocations need to be applied. Is that possible? If so, where does
the added complexity that nullifies the performance gain come from?
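To make the scheme in question concrete, here is a toy model in Python.
It is only a sketch under strong assumptions: the library image is a
plain bytearray, relocation targets are given as offsets into that
image rather than virtual addresses, only R_386_RELATIVE is handled,
and the function names prelink_relative and load are invented here.
It does not describe what ld.so or any actual prelinking tool does.

```python
import struct

MASK32 = 0xFFFFFFFF

def prelink_relative(image, rel_offsets, chosen_base):
    """Pre-apply R_386_RELATIVE relocations for an assumed load address.

    With REL-style entries the addend is stored at the target itself,
    so the relocated value is simply addend + base.
    """
    for off in rel_offsets:
        (addend,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (addend + chosen_base) & MASK32)
    return chosen_base

def load(image, rel_offsets, prelinked_base, actual_base):
    """Return how many relocations had to be processed at load time."""
    if actual_base == prelinked_base:
        # The mapping succeeded at the preferred VMA: nothing to do.
        return 0
    # Fallback: the preferred VMA was taken, so adjust by the difference.
    delta = (actual_base - prelinked_base) & MASK32
    for off in rel_offsets:
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + delta) & MASK32)
    return len(rel_offsets)

# Two words holding link-time addresses 0x100 and 0x200.
image = bytearray(struct.pack("<II", 0x100, 0x200))
offsets = [0, 4]

base = prelink_relative(image, offsets, 0x40000000)
print(load(image, offsets, base, 0x40000000))  # preferred address free
print(load(image, offsets, base, 0x50000000))  # preferred address taken
```

In this model the fast path costs nothing, and the fallback costs the
same as an ordinary relocation pass, which is why the question of where
any extra complexity would come from seems fair.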
Regards,
Martin