This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: ppc64: Call to gettimeofday fails with segfault in __glink_PLTresolve because .plt0 is all zeros.


After a intense debug session, Carlos narrowed down the culprit: vDSO are not attached
in the process with an ODP section and the compiler/link does not create one using
the trick at sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h (it also does not make
sense to create one, since it should be library itself to provide suck entries, not
the modules which uses it). We checked on kernel code (I did also after dump the
vDSO object from the process).

By not having an ODP associated the IFUNC resolver for gettimeofday returns an entry
to BSS segment. The value for function descriptor would be correct pointing vDSO symbol,
however the TOC value can be anything. In fact with older linker, according with the
organization of symbols, it is indeed 0 leading to described issue. With newer linkers,
that changed slight they hash function used to organize the symbol placement, the BSS
place where the gettimeofday IFUNC resolver return is near the __cache_line_size and
then the TOC is not null by accident. The current behavior is working by chance and
the linker can break it by placing the symbol different in the BSS (such is the case
of the issue related).

A way to fix it, as pointed by Carlos, would be make the vDSO get ODP entries for the
symbols, then fixup _dl_vdso_vsym to return an OPD (which may or may not involve
changes to dl_lookup_symbol_x and DL_SYMBOL_ADDRESS). Although it is possible, there are
2 issues with:

1. Older kernel will still need some handling from GLIBC for the missing ODP entries.
2. It will require additional logic from userland in case for a new ABI which does not
   require ODPs. I will check with kernel guys, but I see that ODP are not exported
   to not tie userland with a specific ABI.

So, as also suggested by Carlos, I propose a more simple fix which is add static memory
location to act as artificial ODP entry for the vDSO symbol returned in IFUNC resolver.
The ODP entry is fill as the vDSO value (if it exists) its TOC is initialized to value
different from 0. It is need so PLT resolution does not trigger lazy resolution and it
works because none vDSO symbol uses the TOC. It also does not require any relocations.

I want to thanks Carlos O'Donnel for the much need help on this issue.

---

2013-11-07  Adhemerval Zanella  <azanella@linux.vnet.ibm.com>

	* sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h (VDSO_IFUNC_RET):
	Add artificial ODP entry for vDSO symbol for PPC64.
	* sysdeps/unix/sysv/linux/powerpc/gettimeofday.c: Adjust includes.
	* sysdeps/unix/sysv/linux/powerpc/time.c: Likewise.

--

diff --git a/sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h b/sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h
index ba54de4..70a8ca8 100644
--- a/sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h
+++ b/sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h
@@ -41,12 +41,23 @@ extern void *__vdso_sigtramp32;
 extern void *__vdso_sigtramp_rt32;
 #endif
 
-/* This macro is needed for PPC64 to return a skeleton OPD entry of a vDSO
-   symbol.  This works because _dl_vdso_vsym always return the function
-   address, and no vDSO symbols use the TOC or chain pointers from the OPD
-   so we can allow them to be garbage.  */
 #if defined(__PPC64__) || defined(__powerpc64__)
-#define VDSO_IFUNC_RET(value)  ((void *) &(value))
+/* vDSO symbols do not contain an associated OPD entry (the vDSO elf image
+   does not even contain a .odp section). However, PLT relocations for IFUNC
+   expects function resolvers to return an OPD address.
+   By just blindly returning the __vdso_xxx value it ends up returning a
+   pointer to BSS segment and it may eventually lead to an undefined behavior.
+   This code creates an arficial OPD to act as the vDSO one. The TOC value
+   is set to a non zero value to avoid triggering a lazy resolution through
+   .glink0/.plt0 when the TOC is compared to 0 for thread-safe PLT routines
+   (the vDSO symbol itself does not use the TCO or chain pointer).  */
+#define VDSO_IFUNC_RET(value)                            \
+  ({                                                     \
+    static Elf64_FuncDesc vdso_opd = { .fd_toc = ~0x0 }; \
+    vdso_opd.fd_func = (Elf64_Addr)value;                \
+    &vdso_opd;                                           \
+  })
+
 #else
 #define VDSO_IFUNC_RET(value)  ((void *) (value))
 #endif
diff --git a/sysdeps/unix/sysv/linux/powerpc/gettimeofday.c b/sysdeps/unix/sysv/linux/powerpc/gettimeofday.c
index 6506d75..48c3f84 100644
--- a/sysdeps/unix/sysv/linux/powerpc/gettimeofday.c
+++ b/sysdeps/unix/sysv/linux/powerpc/gettimeofday.c
@@ -22,6 +22,7 @@
 
 # include <dl-vdso.h>
 # include <bits/libc-vdso.h>
+# include <dl-machine.h>
 
 void *gettimeofday_ifunc (void) __asm__ ("__gettimeofday");
 
diff --git a/sysdeps/unix/sysv/linux/powerpc/time.c b/sysdeps/unix/sysv/linux/powerpc/time.c
index 66b4eb3..2d77ece 100644
--- a/sysdeps/unix/sysv/linux/powerpc/time.c
+++ b/sysdeps/unix/sysv/linux/powerpc/time.c
@@ -20,7 +20,9 @@
 
 # include <time.h>
 # include <sysdep.h>
+# include <dl-vdso.h>
 # include <bits/libc-vdso.h>
+# include <dl-machine.h>
 
 void *time_ifunc (void) asm ("time");



On 05-11-2013 19:42, Carlos O'Donell wrote:
> On 11/05/2013 10:03 AM, Alan Modra wrote:
>> On Tue, Nov 05, 2013 at 12:56:32AM -0500, Carlos O'Donell wrote:
>>> # PLT call stub (note that build_plt_stub() in bfd doesn't say much about
>>> # why the stub does what it does so I my analysis might be wrong)...
>>>    11dfc:       f8 41 00 28     std     r2,40(r1)
>>>    11e00:       e9 62 99 a8     ld      r11,-26200(r2)
>>>    11e04:       7d 69 03 a6     mtctr   r11
>>> # At this point r11 points at the address of the VDSO __kernel_gettimeofday
>> Which says you have had a successful resolution of that plt entry at
>> some point.  The PLT is .bss on powerpc64, so it starts all zero then
>> ld.so initialises the function entry address (first dword of each 2 or
>> 3 dword plt entry) to point at the glink code if doing lazy linking or
>> resolves to the final addresses if LD_BIND_NOW=1.
> This is a binutils bug, that has likely been fixed in a newer binutils
> (which I'm not using).
>
> The reduced test case looks like this:
>
> cat >> test-1026533.c <<EOF
> #include <stdio.h>
> #include <stdlib.h>
> #include <dlfcn.h>
>
> union fptr {
>   int (*func)(void);
>   void *vfunc;
> };
>
> int
> main (void)
> {
>   void *handle;
>   union fptr myfptr;
>   char *error;
>   int ret;
>   handle = dlopen ("./lib-1026533.so", RTLD_NOW);
>   if (handle == NULL)
>     {
>       fprintf (stderr, "FAIL: dlopen failed (%s)\n", dlerror ());
>       exit (1);
>     }
>   dlerror ();
>   myfptr.vfunc = dlsym (handle, "func_1026533");
>   if ((error = dlerror ()) != NULL)
>     {
>       fprintf (stderr, "FAIL: dlsym failed (%s)\n", dlerror ());
>       exit (1);
>     }
>   ret = myfptr.func ();
>   if (ret != 0xdeadbeef)
>     {
>       fprintf (stderr, "FAIL: dlopen'd library function failed to "
> 		       "return expected value.\n");
>       exit (1);
>     }
>   dlclose (handle);
>   fprintf (stderr, "PASS: dlopen'd library function did not crash.\n");
>   exit (0);
> }
> EOF
>
> cat >> lib-1026533.c <<EOF
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/time.h>
>
> int
> func_1026533 (void)
> {
>   struct timeval now;
>   int ret;
>   ret = gettimeofday (&now, NULL);
>   if (ret == -1)
>     {
>       perror ("gettimeofday");
>       fprintf (stderr, "FAIL: gettimeofday returned -1.\n");
>       exit (1);
>     }
>   fprintf (stdout, "Time is %ld.%06ld.\n", now.tv_sec, now.tv_usec);
>   return 0xdeadbeef;
> }
> EOF
>
> gcc -Wl,--plt-thread-safe -g3 -O0 -Wall -pedantic -m64 -shared -fPIC -Wl,--soname,lib-1026533.so -o lib-1026533.so lib-1026533.c
> gcc -Wl,--plt-thread-safe -ldl -g3 -O0 -Wall -pedantic -m64 -o test-1026533 test-1026533.c
>
> On a system *with* an IFUNC for gettimeofday:
> ./test-1026533 
> Segmentation fault
> echo $?
> 139
>
> On a system *without* an IFUNC for gettimeofday:
> ./test-1026533 
> Time is 1383686757.603343.
> PASS: dlopen'd library function did not crash.
> echo $?
> 0
>
> It is the critical combination of the new thread-safe plt stubs 
> *and* the VDSO OPD being incomplete, followed by a dlopen with 
> RTLD_NOW which cause the failure.
>
> The caller dlopen's the library with RTLD_NOW, thus the dynamic
> loader expects never to need .plt0 and doesn't set it up.
> However, because the IFUNC OPD has 0 for a TOC, because it's
> incomplete for some reason (the bug?), and because the new
> thread-safe plt sequences check for r2==0 and attempt a lazy
> resolution if it is, we end up in a situation where we are
> calling .glink0 without having setup .plt0.
>
> This combination of events happens with glibc's gettimeofday 
> (new enough glibc with IFUNC for that symbol) when called from
> librpmio.so.1 from python (which always uses RTLD_NOW with
> dlopen). However, it looks like it only happens when an older
> binutils e.g. 2.20.51.0.2 is used to build glibc.
>
> Adhemerval says that 2.23.52.20130531 doesn't have the problem
> with the IFUNC's OPD having a zero TOC.
>
> Any idea what patch fixed this?
>
> Cheers,
> Carlos.
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]