This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



Incompatibility between GNU-ld and SUN's ld.so.1


Hi,

First: I'd appreciate being CC'ed on replies, but I'll try to follow
the thread in the archives.

[ I still think this is a problem in Sun's ld.so.1 and I have an open
  call with Sun. However, as this problem is triggered by libstdc++
  and libgcc_s and the Sun behaviour dates back to Solaris 7 (or even
  earlier), it would be helpful if GNU ld could work around it.
]

Here's the relevant part of my report sent to Sun (I guess you'd
prefer to use Makefile instead of Makefile.sun; note, however, that
using -nostdlib will cause a different crash due to a missing exit()):

----------------- cut here --------------------------------------------

SUMMARY DESCRIPTION: ld.so.1 fails to relocate certain shared libraries

DETAILED DESCRIPTION:

The dynamic runtime linker fails to relocate valid shared libraries
generated by recent versions of GNU-ld. /usr/local/bin/ld is from
the GNU binutils-2.13 package:

       turing$ /usr/local/bin/ld -v
       GNU ld version 2.13

How to reproduce:

Script started on Fri Sep 20 19:46:43 2002
turing$ cat t2.c
struct object {
        int i;
        int j;
        int k;
        int l;
};



int func ()
{
        static struct object x;
        struct object * p;
        p = &x;
        p->i = 3;
        return 0;
}

turing$ cat t3.c
extern int func();

int main ()
{
        func();
        return 0;
}
turing$ cat Makefile.sun
.PHONY: clean
all:    a.out
t2.o:   t2.c
        CC  -c -KPIC t2.c
libt2.so:       t2.o
        /usr/local/bin/ld -G t2.o -olibt2.so
t3.o:   t3.c
        CC  -c t3.c
a.out: libt2.so t3.o
        CC  -lt2 t3.o -L. -R.
clean:
        rm -f *.so *.o a.out

turing$ cat Makefile
.PHONY: clean
all:    a.out
t2.o:   t2.c
        gcc -c -fPIC t2.c
libt2.so: t2.o
        /usr/local/bin/ld -nostdlib -shared -olibt2.so t2.o
a.out: libt2.so t3.c
        gcc -nostdlib t3.c libt2.so -L. -R. 
clean:
        rm -f *.so *.o a.out core

turing$ make -f Makefile.sun clean
rm -f *.so *.o a.out
turing$ make -f Makefile.sun 
CC  -c -KPIC t2.c
/usr/local/bin/ld -G t2.o -olibt2.so
CC  -c t3.c
CC  -lt2 t3.o -L. -R.
turing$ a.out
Segmentation Fault (core dumped)
turing$ exit

script done on Fri Sep 20 19:47:32 2002

Note that I compiled everything with /opt/SUNWspro/bin/CC to
rule out bugs in gcc. This problem can be reproduced using
the second Makefile and gcc with an even smaller resulting
executable.


Analyzing the core shows the following:
turing$ pmap core | grep libt2.so
FF370000      8K read/exec         libt2.so
FF380000      8K read/write/exec   libt2.so

Script started on Fri Sep 20 19:53:10 2002
turing$ gdb a.out core
GNU gdb 5.0
[ ... ]
#0  0xff370318 in __1cEfunc6F_i_ ()
   from /home/thales/ehrhardt/ld.so.1-bug/./libt2.so
(gdb) disass
Dump of assembler code for function __1cEfunc6F_i_:
0xff3702e0 <__1cEfunc6F_i_>:    save  %sp, -112, %sp
0xff3702e4 <__1cEfunc6F_i_+4>:  call  0xff3702ec <__1cEfunc6F_i_+12>
0xff3702e8 <__1cEfunc6F_i_+8>:  sethi  %hi(0), %o1
0xff3702ec <__1cEfunc6F_i_+12>: mov  %o1, %o1   ! 0x0
0xff3702f0 <__1cEfunc6F_i_+16>: add  %o7, %o1, %o1
0xff3702f4 <__1cEfunc6F_i_+20>: st  %o1, [ %fp + -12 ]
0xff3702f8 <__1cEfunc6F_i_+24>: sethi  %hi(0x10000), %o0
0xff3702fc <__1cEfunc6F_i_+28>: or  %o0, 0xc4, %o0      ! 0x100c4
0xff370300 <__1cEfunc6F_i_+32>: add  %o1, %o0, %l7
0xff370304 <__1cEfunc6F_i_+36>: sethi  %hi(0), %g1
0xff370308 <__1cEfunc6F_i_+40>: or  %g1, 4, %g1 ! 0x4
0xff37030c <__1cEfunc6F_i_+44>: ld  [ %l7 + %g1 ], %o0
0xff370310 <__1cEfunc6F_i_+48>: st  %o0, [ %fp + -8 ]
0xff370314 <__1cEfunc6F_i_+52>: mov  3, %o1
0xff370318 <__1cEfunc6F_i_+56>: st  %o1, [ %o0 ]
0xff37031c <__1cEfunc6F_i_+60>: clr  [ %fp + -4 ]
0xff370320 <__1cEfunc6F_i_+64>: mov  %g0, %i0
0xff370324 <__1cEfunc6F_i_+68>: ret 
0xff370328 <__1cEfunc6F_i_+72>: restore 
0xff37032c <__1cEfunc6F_i_+76>: mov  %g0, %i0
0xff370330 <__1cEfunc6F_i_+80>: ret 
0xff370334 <__1cEfunc6F_i_+84>: restore 
---Type <return> to continue, or q <return> to quit---
End of assembler dump.
(gdb) bt
#0  0xff370318 in __1cEfunc6F_i_ ()
   from /home/thales/ehrhardt/ld.so.1-bug/./libt2.so
#1  0x10884 in main ()
(gdb) info reg o0
o0             0xff370000       -13172736
(gdb) info reg o1
o1             0x3      3
(gdb) info reg l7
l7             0xff3803a8       -13106264
(gdb) info reg g1
g1             0x4      4
(gdb) turing$ exit

script done on Fri Sep 20 19:54:46 2002

Looking back at function func from t2.c shows:
int func ()
{
	static struct object x;
	struct object * p;
	p = &x;
	p->i = 3;      <====== crash is here.
	return 0;
}

The value of the pointer p is in register o0, i.e. it is 0xff370000.
This is precisely the base address at which the shared library
libt2.so has been mapped. Register l7 contains the address of
the .got section (the global offset table of this library). The
questionable address is loaded from offset 4 in the global offset table.

Looking at the contents of the global offset table in the shared
library shows the following:

turing$ elfdump -G libt2.so 

Global Offset Table: 2 entries
 ndx     addr      value    reloc              addend   symbol
[00000]  000103a8  00010338 R_SPARC_NONE       00000000 
[00001]  000103ac  000103b0 R_SPARC_RELATIVE   00000000 
turing$ 

Note that we have indeed
%l7(0xff3803a8) = Offset of .got(0x000103a8) + library base address(0xFF370000)

The Solaris Linker and Libraries Guide (freshly downloaded from
docs.sun.com) has this explanation for R_SPARC_RELATIVE:

|Some relocation types have semantics beyond simple calculation:
|[ ... ]
|R_SPARC_RELATIVE
|  Created by the link-editor for dynamic objects. Its offset member
|  gives the location within a shared object that contains a value
|  representing a relative address. The runtime linker computes the
|  corresponding virtual address by adding the virtual address at which
|  the shared object is loaded to the relative address. Relocation
|  entries for this type must specify 0 for the symbol table index.

This means that the value at offset 0x4 in the global offset
table should be
      library base address  + Value in .got
      0xFF370000            + 0x000103B0     = 0xFF3803B0
after relocation. However, looking at the value of register o0 we
see that the .got entry contains the value 0xFF370000 (the bare
library base address) instead.

----------------- cut here --------------------------------------------
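
The arithmetic in the report can be checked with a small C sketch. The
constants are copied from the pmap, gdb and elfdump output above; the
function names and the is_writable() helper are hypothetical and exist
only for this illustration. It also shows why the store faults: the
address that should have been produced lies in the read/write mapping,
while the one actually produced lies in the read/exec mapping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the pmap, gdb and elfdump output in the report. */
#define LIB_BASE    0xFF370000u /* read/exec mapping of libt2.so */
#define DATA_BASE   0xFF380000u /* read/write/exec mapping of libt2.so */
#define DATA_SIZE   0x00002000u /* 8K, as reported by pmap */
#define GOT_OFFSET  0x000103A8u /* addr of GOT entry 0 (elfdump -G) */
#define GOT1_STORED 0x000103B0u /* value stored in GOT entry 1 */
#define GOT1_ADDEND 0x00000000u /* explicit addend of GOT entry 1 */

/* Run-time address of the GOT; should match register %l7. */
uint32_t got_runtime_address(void)
{
    return LIB_BASE + GOT_OFFSET;
}

/* Pointer value if the stored value is relocated (base + value). */
uint32_t pointer_if_stored_value_used(void)
{
    return LIB_BASE + GOT1_STORED;
}

/* Pointer value if only the explicit addend is used (base + addend). */
uint32_t pointer_if_only_addend_used(void)
{
    return LIB_BASE + GOT1_ADDEND;
}

/* Hypothetical helper: is addr inside the writable mapping? */
bool is_writable(uint32_t addr)
{
    return addr >= DATA_BASE && addr - DATA_BASE < DATA_SIZE;
}
```

With these inputs the second function yields 0xFF3803B0 and the third
yields 0xFF370000, which is the value seen in register o0.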

The basic problem is the interpretation of the meaning of
R_SPARC_RELATIVE. Recall the explanation from above:

[ The same document also states that the calculation performed by
  R_SPARC_RELATIVE is B+A (see Terminology below). IMHO this is
  overruled by the first sentence quoted below.
]

|Some relocation types have semantics beyond simple calculation:
|[ ... ]
|R_SPARC_RELATIVE
|  Created by the link-editor for dynamic objects. Its offset member
|  gives the location within a shared object that contains a value
|  representing a relative address. The runtime linker computes the
|  corresponding virtual address by adding the virtual address at which
|  the shared object is loaded to the relative address. Relocation
|  entries for this type must specify 0 for the symbol table index.


This explanation is obviously derived from the SHT_REL case where
the ``relative address'' explained above and the implicit addend
are the same.

Terminology:
* B is the base address where the library is loaded
* A is the EXPLICIT addend
* V is the value stored in the shared library where an implicit addend
  would reside (IMHO this is what ``relative address'' above describes).

Sun's runtime linker used to always calculate V + B + A for
R_SPARC_RELATIVE relocations. However, starting with Solaris 7 and the
advent of DT_RELACOUNT, it calculates only B + A (ignoring V completely)
iff DT_RELACOUNT is actually supplied and explicit addends are used.
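
The two behaviours can be modelled roughly as follows. This is only a
sketch of my understanding, not Sun's code: the struct and function
names are hypothetical, explicit addends (SHT_RELA) are assumed
throughout, and the has_relacount flag stands in for "DT_RELACOUNT is
present in the dynamic section".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one R_SPARC_RELATIVE entry with explicit addend. */
struct rela_relative {
    uint32_t stored_value; /* V: value found at the relocated location */
    uint32_t addend;       /* A: explicit addend from the relocation */
};

/* Behaviour before Solaris 7, as described above: V + B + A. */
uint32_t relocate_old(uint32_t base, struct rela_relative r)
{
    return r.stored_value + base + r.addend;
}

/* Behaviour since Solaris 7, as described above: B + A when
 * DT_RELACOUNT is supplied, otherwise still V + B + A. */
uint32_t relocate_new(uint32_t base, struct rela_relative r,
                      bool has_relacount)
{
    if (has_relacount)
        return base + r.addend; /* V is ignored */
    return r.stored_value + base + r.addend;
}
```

For the GOT entry from the report (V = 0x000103B0, A = 0, B = 0xFF370000)
the old rule gives 0xFF3803B0, while the new rule with DT_RELACOUNT
gives 0xFF370000.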

GNU ld could work around this by always storing the relative address in
the addend and setting V to 0 if explicit addends are used. This is
what Sun's linker has done for quite some time.
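
A minimal sketch of that workaround, under the assumption that an entry
can be described by just V and A; the type and function names are
hypothetical and this is not a patch against the binutils sources. The
point is that once V is folded into A and zeroed, V + B + A and B + A
produce the same result, so either runtime behaviour relocates the
entry correctly.

```c
#include <stdint.h>

/* Hypothetical model of one R_SPARC_RELATIVE entry. */
struct relative_entry {
    uint32_t stored_value; /* V: value stored at the relocated location */
    uint32_t addend;       /* A: explicit addend */
};

/* Proposed normalisation: fold V into A and set V to 0. */
struct relative_entry normalize_entry(struct relative_entry e)
{
    struct relative_entry out;
    out.addend = e.stored_value + e.addend;
    out.stored_value = 0;
    return out;
}

/* Runtime linker that computes V + B + A. */
uint32_t apply_v_plus_b_plus_a(uint32_t base, struct relative_entry e)
{
    return e.stored_value + base + e.addend;
}

/* Runtime linker that computes B + A and ignores V. */
uint32_t apply_b_plus_a(uint32_t base, struct relative_entry e)
{
    return base + e.addend;
}
```

Applied to the entry from the report (V = 0x000103B0, A = 0), the
normalised entry has V = 0 and A = 0x000103B0, and both functions return
0xFF3803B0 for B = 0xFF370000.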

Note: This incompatibility is the cause of recent gcc bug reports
that see crashes in __register_frame_info_bases when starting any
C++ program.

    Regards   Christian

-- 
THAT'S ALL FOLKS!

