This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] BZ #14831: Segfault in _dl_profile_fixup with IRELATIVEand LD_AUDIT
On Wed, Nov 14, 2012 at 10:07 AM, Carlos O'Donell
<carlos@systemhalted.org> wrote:
> On Wed, Nov 14, 2012 at 12:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Nov 13, 2012 at 6:37 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Tue, Nov 13, 2012 at 4:50 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Tue, Nov 13, 2012 at 4:26 PM, Roland McGrath <roland@hack.frob.com> wrote:
>>>>>> + if (l->l_reloc_result == NULL)
>>>>>> + {
>>>>>> + /* Skip audit if l_reloc_result is NULL which happens with
>>>>>> + IRELATIVE relocations in other DSOs, like libm.so. */
>>>>>> + *framesizep = -1;
>>>>>
>>>>> This needs a more extensive comment about how it arises that we get here
>>>>> with l_reloc_result NULL and why it is OK to short-circuit this way.
>>>>
>>>> R_X86_IRELATIVE lazy relocation in libm.so leads to:
>>>>
>>>> 00042010 00000507 R_386_JUMP_SLOT 00000000 __get_cpu_features
>>>>
>>>> But __get_cpu_features isn't set up yet and requires a lazy relocation.
>>>> That is why l_reloc_result is NULL. We don't want audit in R_X86_IRELATIVE
>>>> relocations. What is what my patch does.
>>>>
>>>>>> + return _dl_fixup (l, reloc_arg);
>>>>>
>>>>> How can this be right when ELF_MACHINE_RUNTIME_FIXUP_ARGS is nonempty?
>>>>>
>>>>
>>>> My post has
>>>>
>>>> ---
>>>> Targets which define ELF_MACHINE_RUNTIME_FIXUP_ARGS must find a way
>>>> to pass ELF_MACHINE_RUNTIME_FIXUP_ARGS from _dl_fixup/_dl_profile_fixup
>>>> to _dl_runtime_fixup.
>>>> ---
>>>>
>>>> Only
>>>>
>>>> ports/sysdeps/m68k/dl-machine.h:#define ELF_MACHINE_RUNTIME_FIXUP_ARGS
>>>> long int save_a0, long int save_a1
>>>> sysdeps/sh/dl-machine.h:#define ELF_MACHINE_RUNTIME_FIXUP_ARGS int plt_type
>>>>
>>>> I can work with sh and m68k maintainers on a solution. It shouldn't be
>>>> too hard.
>>>
>>> Here is the updated patch.
>>>
>>>
>>
>> Here is what happened:
>>
>>
>> (gdb) bt
>> #0 _dl_profile_fixup (l=0x7ffff7ff6908, reloc_arg=4, retaddr=140737342864537,
>> regs=0x7fffffffd8b0, framesizep=0x7fffffffdc08) at ../elf/dl-runtime.c:191
>> #1 0x00007ffff7df01e6 in _dl_runtime_profile ()
>> at ../sysdeps/x86_64/dl-trampoline.h:48
>> #2 0x00007ffff753fc99 in ?? ()
>> #3 0x00007fffffffdd50 in ?? ()
>> #4 0x00007ffff7de6f41 in elf_machine_lazy_rel (skip_ifunc=<optimized out>,
>> reloc=0x7ffff7535110, l_addr=140737342799872, map=0x7ffff7ff6908)
>> at ../sysdeps/x86_64/dl-machine.h:511
>> #5 elf_dynamic_do_Rela (skip_ifunc=<optimized out>, lazy=<optimized out>,
>> nrelative=<optimized out>, relsize=<optimized out>,
>> reladdr=<optimized out>, map=0x7ffff7ff6908) at do-rel.h:77
>> #6 _dl_relocate_object (scope=0x7ffff7ff6c60, reloc_mode=<optimized out>,
>> consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:265
>> #7 0x00007ffff7ddfaa2 in dl_main (phdr=<optimized out>, phnum=4160718464,
>> user_entry=<optimized out>, auxv=0x7ffff7ffe101 <cache_new+1>)
>> at rtld.c:2203
>> #8 0x00007ffff7df0f7e in _dl_sysdep_start (
>> start_argptr=start_argptr@entry=0x7fffffffdf50,
>> dl_main=dl_main@entry=0x7ffff7ddda40 <dl_main>) at ../elf/dl-sysdep.c:241
>> #9 0x00007ffff7de0ce6 in _dl_start_final (arg=0x7fffffffdf50) at rtld.c:331
>> #10 _dl_start (arg=0x7fffffffdf50) at rtld.c:557
>> #11 0x00007ffff7ddd508 in _start ()
>> ---Type <return> to continue, or q <return> to quit---
>> from /export/build/gnu/glibc-test/build-x86_64-linux/elf/ld.so
>> #12 0x0000000000000001 in ?? ()
>> #13 0x00007fffffffe262 in ?? ()
>> #14 0x0000000000000000 in ?? ()
>> (gdb) f 6
>> #6 _dl_relocate_object (scope=0x7ffff7ff6c60, reloc_mode=<optimized out>,
>> consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:265
>> 265 ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
>> (gdb) f 4
>> #4 0x00007ffff7de6f41 in elf_machine_lazy_rel (skip_ifunc=<optimized out>,
>> reloc=0x7ffff7535110, l_addr=140737342799872, map=0x7ffff7ff6908)
>> at ../sysdeps/x86_64/dl-machine.h:511
>> 511 value = ((ElfW(Addr) (*) (void)) value) ();
>> (gdb)
>>
>> We have
>>
>> ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
>>
>> #ifndef PROF
>> if (__builtin_expect (consider_profiling, 0))
>> {
>> /* Allocate the array which will contain the already found
>> relocations. If the shared object lacks a PLT (for example
>> if it only contains lead function) the l_info[DT_PLTRELSZ]
>> will be NULL. */
>> if (l->l_info[DT_PLTRELSZ] == NULL)
>> {
>> errstring = N_("%s: no PLTREL found in object %s\n");
>> fatal:
>> _dl_fatal_printf (errstring,
>> rtld_progname ?: "<program name unknown>",
>> l->l_name);
>> }
>>
>> l->l_reloc_result = calloc (sizeof (l->l_reloc_result[0]),
>> l->l_info[DT_PLTRELSZ]->d_un.d_val);
>> if (l->l_reloc_result == NULL)
>> {
>> errstring = N_("\
>> %s: out of memory to store relocation results for %s\n");
>> goto fatal;
>> }
>> }
>> #endif
>> }
>>
>> ld.so crashes because IRELATIVE relocation invokes another relocation
>> before l_reloc_result is allocated. My patch simply avoids it. Here is
>> the patch with updated comments. OK to install?
>
> So the problem is that the objects are loaded in reverse order (for copy reloc
> purposes) and therefore we load libm.so *before* libc.so, and thus
> l_reloc_result hasn't been allocated for libc.so yet, and thus trying to
> resolve a function in libc.so will fail if we have profiling enabled because
> that entry in the link_map is not yet allocated (because we haven't called
> _dl_relocate_object for libc.so yet).
>
> We are going to eventually allocate l_reloc_result anyway. Could we instead
> refactor the l_reloc_result allocation out of _dl_relocate_object and into
> the loop in rtld.c which loads all the objects?
>
It isn't libc.so's l_reloc_result hasn't been allocated. It is libm.so's
l_reloc_result hasn't been allocated since we have
ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
...
l->l_reloc_result = calloc (sizeof (l->l_reloc_result[0]),
Allocate l_reloc_result before ELF_DYNAMIC_RELOCATE causes
tst-audit2 failure:
{abcdef72, d8675309} != {d8675309, abcdef72}
since wrong calloc is called. How about this updated comments?
--
H.J.
---
diff --git a/elf/dl-runtime.c b/elf/dl-runtime.c
index c4c57c2..fa15472 100644
--- a/elf/dl-runtime.c
+++ b/elf/dl-runtime.c
@@ -1,5 +1,5 @@
/* On-demand PLT fixup for shared objects.
- Copyright (C) 1995-2009, 2010, 2011 Free Software Foundation, Inc.
+ Copyright (C) 1995-2012 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
@@ -166,12 +166,11 @@ _dl_profile_fixup (
if (l->l_reloc_result == NULL)
{
- /* Resolve an IRELATIVE relocation in another DSO may reference a
- function defined in libc.so before l_reloc_result is allocated.
- For example, __get_cpu_features in libc.so is called to resolve
- R_X86_64_IRELATIVE relocations in x86-64 libm.so. Skip audit and
- resolve the function in this case. It is OK since we aren't
- supposed to audit IRELATIVE relocations. */
+ /* ELF_DYNAMIC_RELOCATE is called before l_reloc_result is allocated.
+ We will get here if ELF_DYNAMIC_RELOCATE calls a resolver function
+ to resolve IRELATIVE relocation. For example, resolver in x86-64
+ libm.so calls __get_cpu_features defined in libc.so. Skip audit
+ and resolve the external function in this case. */
*framesizep = -1;
return _dl_fixup (
# ifdef ELF_MACHINE_RUNTIME_FIXUP_ARGS