This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: RFC: TLS improvements for IA32 and AMD64/EM64T
- From: Alexandre Oliva <aoliva at redhat dot com>
- To: libc-alpha at sources dot redhat dot com
- Date: Mon, 06 Feb 2006 14:32:02 -0200
- Subject: Re: RFC: TLS improvements for IA32 and AMD64/EM64T
- References: <ord5n91tx5.fsf@livre.oliva.athome.lsd.ic.unicamp.br> <orzmqdzhxi.fsf@livre.oliva.athome.lsd.ic.unicamp.br> <orwtlfwmy8.fsf@livre.oliva.athome.lsd.ic.unicamp.br> <orr7bh8up9.fsf@livre.oliva.athome.lsd.ic.unicamp.br>
On Sep 22, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Sep 17, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Sep 16, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> On Sep 16, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
>>>> Over the past few months, I've been working on porting to IA32 and
>>>> AMD64/EM64T the interesting bits of the TLS design I came up with for
>>>> FR-V, achieving some impressive speedups along with slight code size
>>>> reductions in the most common cases.
>>>> Although the design is not set in stone yet, it's fully implemented
>>>> and functional with patches I'm about to post for binutils, gcc and
>>>> glibc mainline, as follow-ups to this message, except that the GCC
>>>> patch will go to gcc-patches, as expected.
>>> This is the glibc portion of the implementation. I've built it for
>>> amd64-linux-gnu and i686-pc-linux-gnu, with both the current TLS
>>> dialect and the new -mtls-dialect=gnu2, without regressions.
>> Revised patch using new relocation and dynamic entry numbers. Tested
>> again on all 4 combinations mentioned above.
> Revised patch that does not include other changes, fixes a few
> inconsistent uses of addends here and there (copy&pastos), avoids
> crashing when lazily resolving TLSDESC relocations to weak symbols
> that turn out to be undefined, and, as a bonus, handles them correctly
> such that their address does map to NULL for all threads. I know
> people shouldn't rely on this, since the linker may very well break
> it, but hey, since I was already tweaking the code to avoid crashes,
> why not take the final step and make it work as closely as possible to
> a random person's expectations? :-)
> Index: ChangeLog
> from Alexandre Oliva <aoliva@redhat.com>
> Introduce TLS descriptors for i386 and x86_64.
binutils 2.17 and GCC 4.2 will be able to generate code conforming to
this ABI extension, so ideally GLIBC should be able to support the
corresponding relocations. The patch has been pending for a while and
AFAIK it was not even reviewed yet :-(
This was re-tested with yesterday's glibc tree, using GCC and binutils
trunks as of one or two weeks ago, no regressions on 32- or 64-bit
builds using the old and the new dynamic access models, with the use
of Initial Exec enabled or forced disabled (for a total of 8 builds).
Index: ChangeLog
from Alexandre Oliva <aoliva@redhat.com>
Introduce TLS descriptors for i386 and x86_64.
* elf/dl-reloc.c (_dl_try_allocate_static_tls): Extract from...
(_dl_allocate_static_tls): ... here. Rearrange failure path.
(TRY_STATIC_TLS): New macro.
* elf/dl-conflict.c (TRY_STATIC_TLS): Dummy define.
* elf/elf.h (DT_TLSDESC_GOT, DT_TLSDESC_PLT): Define.
(R_386_TLS_GOTDESC, R_386_TLS_DESC_CALL, R_386_TLS_DESC): Define.
(R_X86_64_PC64, R_X86_GOTOFF64, R_X86_64_GOTPC32): Merge from
binutils.
(R_X86_64_GOTPC32_TLSDESC, R_X86_64_TLSDESC_CALL,
R_X86_64_TLSDESC): Define.
(R_386_NUM, R_X86_64_NUM): Adjust.
* sysdeps/i386/Makefile (sysdep-dl-routines, sysdep_routines,
systep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
* sysdeps/i386/dl-lookupcfg.h: New file. Introduce _dl_unmap to
release tlsdesc_table.
* sysdeps/i386/dl-machine.h: Include dl-tlsdesc.h.
(elf_machine_type_class): Mark R_386_TLS_DESC as PLT class.
(elf_machine_rel): Handle R_386_TLS_DESC.
(elf_machine_rela): Likewise.
(elf_machine_lazy_rel): Likewise.
(elf_machine_lazy_rela): Likewise.
* sysdeps/i386/dl-tls.h (struct dl_tls_index): Name it.
* sysdeps/i386/dl-tlsdesc.S: New file.
* sysdeps/i386/dl-tlsdesc.h: New file.
* sysdeps/i386/tlsdesc.c: New file.
* sysdeps/i386/tlsdesc.sym: New file.
* sysdeps/i386/bits/linkmap.h (struct link_map_machine): Add
tlsdesc_table.
* sysdeps/x86_64/Makefile (sysdep-dl-routines, sysdep_routines,
systep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
* sysdeps/x86_64/dl-lookupcfg.h: New file. Introduce _dl_unmap to
release tlsdesc_table.
* sysdeps/x86_64/dl-machine.h: Include dl-tlsdesc.h.
(elf_machine_runtime_setup): Set up lazy TLSDESC GOT entry.
(elf_machine_type_class): Mark R_X86_64_TLSDESC as PLT class.
(elf_machine_rel): Handle R_X86_64_TLSDESC.
(elf_machine_rela): Likewise.
(elf_machine_lazy_rel): Likewise.
* sysdeps/x86_64/dl-tls.h (struct dl_tls_index): Name it.
(__tls_get_addr): Do not declare for non-shared compiles.
* sysdeps/x86_64/dl-tlsdesc.S: New file.
* sysdeps/x86_64/dl-tlsdesc.h: New file.
* sysdeps/x86_64/tlsdesc.c: New file.
* sysdeps/x86_64/tlsdesc.sym: New file.
* sysdeps/x86_64/bits/linkmap.h (struct link_map_machine): Add
tlsdesc_table for both 32- and 64-bit structs.
Index: elf/dl-conflict.c
===================================================================
--- elf/dl-conflict.c.orig 2006-01-13 18:14:11.000000000 -0500
+++ elf/dl-conflict.c 2006-01-13 18:16:22.000000000 -0500
@@ -45,6 +45,7 @@
#define RESOLVE_MAP(ref, version, flags) (*ref = NULL, NULL)
#define RESOLVE(ref, version, flags) (*ref = NULL, 0)
#define CHECK_STATIC_TLS(ref_map, sym_map) ((void) 0)
+#define TRY_STATIC_TLS(ref_map, sym_map) (0)
#define RESOLVE_CONFLICT_FIND_MAP(map, r_offset) \
do { \
while ((resolve_conflict_map->l_map_end < (ElfW(Addr)) (r_offset)) \
Index: elf/dl-reloc.c
===================================================================
--- elf/dl-reloc.c.orig 2006-01-13 18:16:22.000000000 -0500
+++ elf/dl-reloc.c 2006-01-13 18:16:22.000000000 -0500
@@ -44,9 +44,9 @@
This function intentionally does not return any value but signals error
directly, as static TLS should be rare and code handling it should
not be inlined as much as possible. */
-void
-internal_function __attribute_noinline__
-_dl_allocate_static_tls (struct link_map *map)
+int
+internal_function
+_dl_try_allocate_static_tls (struct link_map *map)
{
/* If we've already used the variable with dynamic access, or if the
alignment requirements are too high, fail. */
@@ -54,8 +54,7 @@
|| map->l_tls_align > GL(dl_tls_static_align))
{
fail:
- _dl_signal_error (0, map->l_name, NULL, N_("\
-cannot allocate memory in static TLS block"));
+ return -1;
}
# if TLS_TCB_AT_TP
@@ -109,6 +108,20 @@
}
else
map->l_need_tls_init = 1;
+
+ return 0;
+}
+
+void
+internal_function __attribute_noinline__
+_dl_allocate_static_tls (struct link_map *map)
+{
+ if (map->l_tls_offset == FORCED_DYNAMIC_TLS_OFFSET
+ || _dl_try_allocate_static_tls (map))
+ {
+ _dl_signal_error (0, map->l_name, NULL, N_("\
+cannot allocate memory in static TLS block"));
+ }
}
/* Initialize static TLS area and DTV for current (only) thread.
@@ -267,6 +280,12 @@
_dl_allocate_static_tls (sym_map); \
} while (0)
+#define TRY_STATIC_TLS(map, sym_map) \
+ (__builtin_expect ((sym_map)->l_tls_offset \
+ != FORCED_DYNAMIC_TLS_OFFSET, 1) \
+ && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1) \
+ || _dl_try_allocate_static_tls (sym_map) == 0))
+
#include "dynamic-link.h"
ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling);
Index: elf/elf.h
===================================================================
--- elf/elf.h.orig 2006-01-13 18:14:11.000000000 -0500
+++ elf/elf.h 2006-01-13 18:16:22.000000000 -0500
@@ -699,6 +699,12 @@
If any adjustment is made to the ELF object after it has been
built these entries will need to be adjusted. */
#define DT_ADDRRNGLO 0x6ffffe00
+#define DT_TLSDESC_PLT 0x6ffffef6 /* Location of PLT entry for
+ TLS descriptor resolver
+ calls. */
+#define DT_TLSDESC_GOT 0x6ffffef7 /* Location of GOT entry used
+ by TLS descriptor resolver
+ PLT entry. */
#define DT_GNU_CONFLICT 0x6ffffef8 /* Start of conflict section */
#define DT_GNU_LIBLIST 0x6ffffef9 /* Library list */
#define DT_CONFIG 0x6ffffefa /* Configuration information. */
@@ -1136,8 +1142,17 @@
#define R_386_TLS_DTPMOD32 35 /* ID of module containing symbol */
#define R_386_TLS_DTPOFF32 36 /* Offset in TLS block */
#define R_386_TLS_TPOFF32 37 /* Negated offset in static TLS block */
+/* 38? */
+#define R_386_TLS_GOTDESC 39 /* GOT offset for TLS descriptor. */
+#define R_386_TLS_DESC_CALL 40 /* Marker of call through TLS
+ descriptor for
+ relaxation. */
+#define R_386_TLS_DESC 41 /* TLS descriptor containing
+ pointer to code and to
+ argument, returning the TLS
+ offset for the symbol. */
/* Keep this the last entry. */
-#define R_386_NUM 38
+#define R_386_NUM 42
/* SUN SPARC specific definitions. */
@@ -2509,8 +2524,17 @@
#define R_X86_64_GOTTPOFF 22 /* 32 bit signed PC relative offset
to GOT entry for IE symbol */
#define R_X86_64_TPOFF32 23 /* Offset in initial TLS block */
+#define R_X86_64_PC64 24 /* PC relative 64 bit */
+#define R_X86_64_GOTOFF64 25 /* 64 bit offset to GOT */
+#define R_X86_64_GOTPC32 26 /* 32 bit signed pc relative
+ offset to GOT */
+/* 27 .. 33 */
+#define R_X86_64_GOTPC32_TLSDESC 34 /* GOT offset for TLS descriptor. */
+#define R_X86_64_TLSDESC_CALL 35 /* Marker for call through TLS
+ descriptor. */
+#define R_X86_64_TLSDESC 36 /* TLS descriptor. */
-#define R_X86_64_NUM 24
+#define R_X86_64_NUM 37
/* AM33 relocations. */
Index: sysdeps/i386/Makefile
===================================================================
--- sysdeps/i386/Makefile.orig 2006-01-13 18:14:11.000000000 -0500
+++ sysdeps/i386/Makefile 2006-01-13 18:16:22.000000000 -0500
@@ -65,3 +65,13 @@
ifneq (,$(filter -mno-tls-direct-seg-refs,$(CFLAGS)))
defines += -DNO_TLS_DIRECT_SEG_REFS
endif
+
+ifeq ($(subdir),elf)
+sysdep-dl-routines += tlsdesc dl-tlsdesc
+sysdep_routines += tlsdesc dl-tlsdesc
+sysdep-rtld-routines += tlsdesc dl-tlsdesc
+endif
+
+ifeq ($(subdir),csu)
+gen-as-const-headers += tlsdesc.sym
+endif
Index: sysdeps/i386/bits/linkmap.h
===================================================================
--- sysdeps/i386/bits/linkmap.h.orig 2006-01-13 18:14:11.000000000 -0500
+++ sysdeps/i386/bits/linkmap.h 2006-01-13 18:16:22.000000000 -0500
@@ -2,4 +2,5 @@
{
Elf32_Addr plt; /* Address of .plt + 0x16 */
Elf32_Addr gotplt; /* Address of .got + 0x0c */
+ void *tlsdesc_table; /* Address of TLS descriptor hash table. */
};
Index: sysdeps/i386/dl-lookupcfg.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/dl-lookupcfg.h 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,28 @@
+/* Configuration of lookup functions.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
Index: sysdeps/i386/dl-machine.h
===================================================================
--- sysdeps/i386/dl-machine.h.orig 2006-01-13 18:14:11.000000000 -0500
+++ sysdeps/i386/dl-machine.h 2006-01-13 18:16:22.000000000 -0500
@@ -25,6 +25,7 @@
#include <sys/param.h>
#include <sysdep.h>
#include <tls.h>
+#include <dl-tlsdesc.h>
/* Return nonzero iff ELF header is compatible with the running host. */
static inline int __attribute__ ((unused))
@@ -248,7 +249,7 @@
# define elf_machine_type_class(type) \
((((type) == R_386_JMP_SLOT || (type) == R_386_TLS_DTPMOD32 \
|| (type) == R_386_TLS_DTPOFF32 || (type) == R_386_TLS_TPOFF32 \
- || (type) == R_386_TLS_TPOFF) \
+ || (type) == R_386_TLS_TPOFF || (type) == R_386_TLS_DESC) \
* ELF_RTYPE_CLASS_PLT) \
| (((type) == R_386_COPY) * ELF_RTYPE_CLASS_COPY))
#else
@@ -375,6 +376,38 @@
*reloc_addr = sym->st_value;
# endif
break;
+ case R_386_TLS_DESC:
+ {
+ struct tlsdesc volatile *td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+ if (! sym)
+ td->entry = _dl_tlsdesc_undefweak;
+ else
+# endif
+ {
+# ifndef RTLD_BOOTSTRAP
+# ifndef SHARED
+ CHECK_STATIC_TLS (map, sym_map);
+# else
+ if (!TRY_STATIC_TLS (map, sym_map))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic
+ (sym_map, sym->st_value + (ElfW(Word))td->arg);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+# endif
+ {
+ td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+ + (ElfW(Word))td->arg);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+ break;
+ }
case R_386_TLS_TPOFF32:
/* The offset is positive, backward from the thread pointer. */
# ifdef RTLD_BOOTSTRAP
@@ -488,6 +521,41 @@
Therefore the offset is already correct. */
*reloc_addr = (sym == NULL ? 0 : sym->st_value) + reloc->r_addend;
break;
+ case R_386_TLS_DESC:
+ {
+ struct tlsdesc volatile *td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+ if (!sym)
+ {
+ td->arg = (void*)reloc->r_addend;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+# endif
+ {
+# ifndef RTLD_BOOTSTRAP
+# ifndef SHARED
+ CHECK_STATIC_TLS (map, sym_map);
+# else
+ if (!TRY_STATIC_TLS (map, sym_map))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic
+ (sym_map, sym->st_value + reloc->r_addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+# endif
+ {
+ td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+ }
+ break;
case R_386_TLS_TPOFF32:
/* The offset is positive, backward from the thread pointer. */
/* We know the offset of object the symbol is contained in.
@@ -582,6 +650,55 @@
*reloc_addr = (map->l_mach.plt
+ (((Elf32_Addr) reloc_addr) - map->l_mach.gotplt) * 4);
}
+#ifdef USE_TLS
+ else if (__builtin_expect (r_type == R_386_TLS_DESC, 1))
+ {
+ struct tlsdesc volatile * __attribute__((__unused__)) td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+ /* Handle relocations that reference the local *ABS* in a simple
+ way, so as to preserve a potential addend. */
+ if (ELF32_R_SYM (reloc->r_info) == 0)
+ td->entry = _dl_tlsdesc_resolve_abs_plus_addend;
+ /* Given a known-zero addend, we can store a pointer to the
+ reloc in the arg position. */
+ else if (td->arg == 0)
+ {
+ td->arg = (void*)reloc;
+ td->entry = _dl_tlsdesc_resolve_rel;
+ }
+ else
+ {
+ /* We could handle non-*ABS* relocations with non-zero addends
+ by allocating dynamically an arg to hold a pointer to the
+ reloc, but that sounds pointless. */
+ const Elf32_Rel *const r = reloc;
+ /* The code below was borrowed from elf_dynamic_do_rel(). */
+ const ElfW(Sym) *const symtab =
+ (const void *) D_PTR (map, l_info[DT_SYMTAB]);
+
+#ifdef RTLD_BOOTSTRAP
+ /* The dynamic linker always uses versioning. */
+ assert (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL);
+#else
+ if (map->l_info[VERSYMIDX (DT_VERSYM)])
+#endif
+ {
+ const ElfW(Half) *const version =
+ (const void *) D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+ ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
+ elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)],
+ &map->l_versions[ndx],
+ (void *) (l_addr + r->r_offset));
+ }
+#ifndef RTLD_BOOTSTRAP
+ else
+ elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)], NULL,
+ (void *) (l_addr + r->r_offset));
+#endif
+ }
+ }
+#endif
else
_dl_reloc_bad_type (map, r_type, 1);
}
@@ -593,6 +710,22 @@
elf_machine_lazy_rela (struct link_map *map,
Elf32_Addr l_addr, const Elf32_Rela *reloc)
{
+#ifdef USE_TLS
+ Elf32_Addr *const reloc_addr = (void *) (l_addr + reloc->r_offset);
+ const unsigned int r_type = ELF32_R_TYPE (reloc->r_info);
+ if (__builtin_expect (r_type == R_386_JMP_SLOT, 1))
+ ;
+ else if (__builtin_expect (r_type == R_386_TLS_DESC, 1))
+ {
+ struct tlsdesc volatile * __attribute__((__unused__)) td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+ td->arg = (void*)reloc;
+ td->entry = _dl_tlsdesc_resolve_rela;
+ }
+ else
+ _dl_reloc_bad_type (map, r_type, 1);
+#endif
}
#endif /* !RTLD_BOOTSTRAP */
Index: sysdeps/i386/dl-tls.h
===================================================================
--- sysdeps/i386/dl-tls.h.orig 2006-01-13 18:14:11.000000000 -0500
+++ sysdeps/i386/dl-tls.h 2006-01-13 18:16:22.000000000 -0500
@@ -19,7 +19,7 @@
/* Type used for the representation of TLS information in the GOT. */
-typedef struct
+typedef struct dl_tls_index
{
unsigned long int ti_module;
unsigned long int ti_offset;
Index: sysdeps/i386/dl-tlsdesc.S
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/dl-tlsdesc.S 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,228 @@
+/* Thread-local storage handling in the ELF dynamic linker. i386 version.
+ Copyright (C) 2004, 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#include <sysdep.h>
+#include <tls.h>
+#include "tlsdesc.h"
+
+ .text
+#ifdef USE_TLS
+ .hidden _dl_tlsdesc_return
+ .global _dl_tlsdesc_return
+ .type _dl_tlsdesc_return,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_return:
+ movl 4(%eax), %eax
+ ret
+ cfi_endproc
+ .size _dl_tlsdesc_return, .-_dl_tlsdesc_return
+
+ .hidden _dl_tlsdesc_undefweak
+ .global _dl_tlsdesc_undefweak
+ .type _dl_tlsdesc_undefweak,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_undefweak:
+ movl 4(%eax), %eax
+ subl %gs:0, %eax
+ ret
+ cfi_endproc
+ .size _dl_tlsdesc_undefweak, .-_dl_tlsdesc_undefweak
+
+#ifdef SHARED
+ .hidden _dl_tlsdesc_dynamic
+ .global _dl_tlsdesc_dynamic
+ .type _dl_tlsdesc_dynamic,@function
+
+ /* %eax points to the TLS descriptor, such that 0(%eax) points to
+ _dl_tlsdesc_dynamic itself, and 4(%eax) points to a struct
+ tlsdesc_dynamic_arg object. It must return in %eax the offset
+ between the thread pointer and the object denoted by the
+ argument, without clobbering any registers.
+
+ The assembly code that follows is a rendition of the following
+ C code, hand-optimized a little bit.
+
+ptrdiff_t
+__attribute__ ((__regparm__ (1)))
+_dl_tlsdesc_dynamic (struct tlsdesc *tdp)
+{
+ struct tlsdesc_dynamic_arg *td = tdp->arg;
+ dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + DTV_OFFSET);
+ if (__builtin_expect (td->gen_count <= dtv[0].counter
+ && (dtv[td->tlsinfo.ti_module].pointer.val
+ != TLS_DTV_UNALLOCATED),
+ 1))
+ return dtv[td->tlsinfo.ti_module].pointer.val + td->tlsinfo.ti_offset
+ - __thread_pointer;
+
+ return ___tls_get_addr (&td->tlsinfo) - __thread_pointer;
+}
+*/
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_dynamic:
+ /* Preserve call-clobbered registers.
+ We need two scratch regs anyway.
+ FIXME: maybe remove the requirement to preserve them? */
+ subl $28, %esp
+ cfi_adjust_cfa_offset (28)
+ movl %ecx, 20(%esp)
+ movl %edx, 24(%esp)
+ movl TLSDESC_ARG(%eax), %eax
+ movl %gs:DTV_OFFSET, %edx
+ movl TLSDESC_GEN_COUNT(%eax), %ecx
+ cmpl (%edx), %ecx
+ ja .Lslow
+ movl TLSDESC_MODID(%eax), %ecx
+ movl (%edx,%ecx,8), %edx
+ cmpl $-1, %edx
+ je .Lslow
+ movl TLSDESC_MODOFF(%eax), %eax
+ addl %edx, %eax
+.Lret:
+ movl 20(%esp), %ecx
+ subl %gs:0, %eax
+ movl 24(%esp), %edx
+ addl $28, %esp
+ cfi_adjust_cfa_offset (-28)
+ ret
+ .p2align 4,,7
+.Lslow:
+ cfi_adjust_cfa_offset (28)
+ movl %ebx, 16(%esp)
+ call __i686.get_pc_thunk.bx
+ addl $_GLOBAL_OFFSET_TABLE_, %ebx
+ call ___tls_get_addr@PLT
+ movl 16(%esp), %ebx
+ jmp .Lret
+ cfi_endproc
+ .size _dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
+#endif /* SHARED */
+
+ .hidden _dl_tlsdesc_resolve_abs_plus_addend
+ .global _dl_tlsdesc_resolve_abs_plus_addend
+ .type _dl_tlsdesc_resolve_abs_plus_addend,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_abs_plus_addend:
+0:
+ pushl %eax
+ cfi_adjust_cfa_offset (4)
+ pushl %ecx
+ cfi_adjust_cfa_offset (4)
+ pushl %edx
+ cfi_adjust_cfa_offset (4)
+ movl $1f - 0b, %ecx
+ movl 4(%ebx), %edx
+ call _dl_tlsdesc_resolve_abs_plus_addend_fixup
+1:
+ popl %edx
+ cfi_adjust_cfa_offset (-4)
+ popl %ecx
+ cfi_adjust_cfa_offset (-4)
+ popl %eax
+ cfi_adjust_cfa_offset (-4)
+ jmp *(%eax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_abs_plus_addend, .-_dl_tlsdesc_resolve_abs_plus_addend
+
+ .hidden _dl_tlsdesc_resolve_rel
+ .global _dl_tlsdesc_resolve_rel
+ .type _dl_tlsdesc_resolve_rel,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_rel:
+0:
+ pushl %eax
+ cfi_adjust_cfa_offset (4)
+ pushl %ecx
+ cfi_adjust_cfa_offset (4)
+ pushl %edx
+ cfi_adjust_cfa_offset (4)
+ movl $1f - 0b, %ecx
+ movl 4(%ebx), %edx
+ call _dl_tlsdesc_resolve_rel_fixup
+1:
+ popl %edx
+ cfi_adjust_cfa_offset (-4)
+ popl %ecx
+ cfi_adjust_cfa_offset (-4)
+ popl %eax
+ cfi_adjust_cfa_offset (-4)
+ jmp *(%eax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_rel, .-_dl_tlsdesc_resolve_rel
+
+ .hidden _dl_tlsdesc_resolve_rela
+ .global _dl_tlsdesc_resolve_rela
+ .type _dl_tlsdesc_resolve_rela,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_rela:
+0:
+ pushl %eax
+ cfi_adjust_cfa_offset (4)
+ pushl %ecx
+ cfi_adjust_cfa_offset (4)
+ pushl %edx
+ cfi_adjust_cfa_offset (4)
+ movl $1f - 0b, %ecx
+ movl 4(%ebx), %edx
+ call _dl_tlsdesc_resolve_rela_fixup
+1:
+ popl %edx
+ cfi_adjust_cfa_offset (-4)
+ popl %ecx
+ cfi_adjust_cfa_offset (-4)
+ popl %eax
+ cfi_adjust_cfa_offset (-4)
+ jmp *(%eax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
+
+ .hidden _dl_tlsdesc_resolve_hold
+ .global _dl_tlsdesc_resolve_hold
+ .type _dl_tlsdesc_resolve_hold,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_hold:
+0:
+ pushl %eax
+ cfi_adjust_cfa_offset (4)
+ pushl %ecx
+ cfi_adjust_cfa_offset (4)
+ pushl %edx
+ cfi_adjust_cfa_offset (4)
+ movl $1f - 0b, %ecx
+ movl 4(%ebx), %edx
+ call _dl_tlsdesc_resolve_hold_fixup
+1:
+ popl %edx
+ cfi_adjust_cfa_offset (-4)
+ popl %ecx
+ cfi_adjust_cfa_offset (-4)
+ popl %eax
+ cfi_adjust_cfa_offset (-4)
+ jmp *(%eax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
+
+#endif /* USE_TLS */
Index: sysdeps/i386/dl-tlsdesc.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/dl-tlsdesc.h 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,60 @@
+/* Thread-local storage descriptor handling in the ELF dynamic linker.
+ i386 version.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#ifndef _I386_DL_TLSDESC_H
+# define _I386_DL_TLSDESC_H 1
+
+/* Type used to represent a TLS descriptor in the GOT. */
+struct tlsdesc
+{
+ ptrdiff_t __attribute__((regparm(1))) (*entry)(struct tlsdesc *);
+ void *arg;
+};
+
+typedef struct dl_tls_index
+{
+ unsigned long int ti_module;
+ unsigned long int ti_offset;
+} tls_index;
+
+/* Type used as the argument in a TLS descriptor for a symbol that
+ needs dynamic TLS offsets. */
+struct tlsdesc_dynamic_arg
+{
+ tls_index tlsinfo;
+ size_t gen_count;
+};
+
+extern ptrdiff_t attribute_hidden __attribute__((regparm(1)))
+ _dl_tlsdesc_return(struct tlsdesc *),
+ _dl_tlsdesc_undefweak(struct tlsdesc *),
+ _dl_tlsdesc_resolve_abs_plus_addend(struct tlsdesc *),
+ _dl_tlsdesc_resolve_rel(struct tlsdesc *),
+ _dl_tlsdesc_resolve_rela(struct tlsdesc *),
+ _dl_tlsdesc_resolve_hold(struct tlsdesc *);
+
+# ifdef SHARED
+extern void *_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset);
+
+extern ptrdiff_t attribute_hidden __attribute__((regparm(1)))
+ _dl_tlsdesc_dynamic(struct tlsdesc *);
+# endif
+
+#endif
Index: sysdeps/i386/tlsdesc.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/tlsdesc.c 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,673 @@
+/* Manage TLS descriptors. i386 version.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#include <link.h>
+#include <ldsodefs.h>
+#include <elf/dynamic-link.h>
+#include <tls.h>
+#include <dl-tlsdesc.h>
+
+#ifdef USE_TLS
+# ifdef SHARED
+
+extern void weak_function free (void *ptr);
+
+/* The hashcode handling code below is heavily inspired in libiberty's
+ hashtab code, but with most adaptation points and support for
+ deleting elements removed.
+
+ Copyright (C) 1999, 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
+ Contributed by Vladimir Makarov (vmakarov@cygnus.com). */
+
+inline static unsigned long
+higher_prime_number (unsigned long n)
+{
+ /* These are primes that are near, but slightly smaller than, a
+ power of two. */
+ static const unsigned long primes[] = {
+ (unsigned long) 7,
+ (unsigned long) 13,
+ (unsigned long) 31,
+ (unsigned long) 61,
+ (unsigned long) 127,
+ (unsigned long) 251,
+ (unsigned long) 509,
+ (unsigned long) 1021,
+ (unsigned long) 2039,
+ (unsigned long) 4093,
+ (unsigned long) 8191,
+ (unsigned long) 16381,
+ (unsigned long) 32749,
+ (unsigned long) 65521,
+ (unsigned long) 131071,
+ (unsigned long) 262139,
+ (unsigned long) 524287,
+ (unsigned long) 1048573,
+ (unsigned long) 2097143,
+ (unsigned long) 4194301,
+ (unsigned long) 8388593,
+ (unsigned long) 16777213,
+ (unsigned long) 33554393,
+ (unsigned long) 67108859,
+ (unsigned long) 134217689,
+ (unsigned long) 268435399,
+ (unsigned long) 536870909,
+ (unsigned long) 1073741789,
+ (unsigned long) 2147483647,
+ /* 4294967291L */
+ ((unsigned long) 2147483647) + ((unsigned long) 2147483644),
+ };
+
+ const unsigned long *low = &primes[0];
+ const unsigned long *high = &primes[sizeof(primes) / sizeof(primes[0])];
+
+ while (low != high)
+ {
+ const unsigned long *mid = low + (high - low) / 2;
+ if (n > *mid)
+ low = mid + 1;
+ else
+ high = mid;
+ }
+
+#if 0
+ /* If we've run out of primes, abort. */
+ if (n > *low)
+ {
+ fprintf (stderr, "Cannot find prime bigger than %lu\n", n);
+ abort ();
+ }
+#endif
+
+ return *low;
+}
+
+struct hashtab
+{
+ /* Table itself. */
+ void **entries;
+
+ /* Current size (in entries) of the hash table */
+ size_t size;
+
+ /* Current number of elements. */
+ size_t n_elements;
+};
+
+inline static struct hashtab *
+htab_create (void)
+{
+ struct hashtab *ht = malloc (sizeof (struct hashtab));
+
+ if (! ht)
+ return NULL;
+ ht->size = 3;
+ ht->entries = malloc (sizeof (void *) * ht->size);
+ if (! ht->entries)
+ return NULL;
+
+ ht->n_elements = 0;
+
+ memset (ht->entries, 0, sizeof (void *) * ht->size);
+
+ return ht;
+}
+
+/* This is only called from _dl_unmap, so it's safe to call
+ free(). See the discussion below. */
+inline static void
+htab_delete (struct hashtab *htab)
+{
+ int i;
+
+ for (i = htab->size - 1; i >= 0; i--)
+ if (htab->entries[i])
+ free (htab->entries[i]);
+
+ free (htab->entries);
+ free (htab);
+}
+
+/* Similar to htab_find_slot, but without several unwanted side effects:
+ - Does not call htab->eq_f when it finds an existing entry.
+ - Does not change the count of elements/searches/collisions in the
+ hash table.
+ This function also assumes there are no deleted entries in the table.
+ HASH is the hash value for the element to be inserted. */
+
+inline static void **
+find_empty_slot_for_expand (struct hashtab *htab, int hash)
+{
+ size_t size = htab->size;
+ unsigned int index = hash % size;
+ void **slot = htab->entries + index;
+ int hash2;
+
+ if (! *slot)
+ return slot;
+
+ hash2 = 1 + hash % (size - 2);
+ for (;;)
+ {
+ index += hash2;
+ if (index >= size)
+ index -= size;
+
+ slot = htab->entries + index;
+ if (! *slot)
+ return slot;
+ }
+}
+
+/* The following function changes size of memory allocated for the
+ entries and repeatedly inserts the table elements. The occupancy
+ of the table after the call will be about 50%. Naturally the hash
+ table must already exist. Remember also that the place of the
+ table entries is changed. If memory allocation failures are allowed,
+ this function will return zero, indicating that the table could not be
+ expanded. If all goes well, it will return a non-zero value. */
+
+inline static int
+htab_expand (struct hashtab *htab, int (*hash_fn)(void *))
+{
+ void **oentries;
+ void **olimit;
+ void **p;
+ void **nentries;
+ size_t nsize;
+
+ oentries = htab->entries;
+ olimit = oentries + htab->size;
+
+ /* Resize only when table after removal of unused elements is either
+ too full or too empty. */
+ if (htab->n_elements * 2 > htab->size)
+ nsize = higher_prime_number (htab->n_elements * 2);
+ else
+ nsize = htab->size;
+
+ nentries = malloc (sizeof (void *) * nsize);
+ memset (nentries, 0, sizeof (void *) * nsize);
+ if (nentries == NULL)
+ return 0;
+ htab->entries = nentries;
+ htab->size = nsize;
+
+ p = oentries;
+ do
+ {
+ if (*p)
+ *find_empty_slot_for_expand (htab, hash_fn (*p))
+ = *p;
+
+ p++;
+ }
+ while (p < olimit);
+
+#if 0 /* We can't tell whether this was allocated by the malloc()
+ built into ld.so or the one in the main executable or libc,
+ and calling free() for something that wasn't malloc()ed could
+ do Very Bad Things (TM). Take the conservative approach
+ here, potentially wasting as much memory as actually used by
+ the hash table, even if multiple growths occur. That's not
+ so bad as to require some overengineered solution that would
+ enable us to keep track of how it was allocated. */
+ free (oentries);
+#endif
+ return 1;
+}
+
+/* This function searches for a hash table slot containing an entry
+ equal to the given element. To delete an entry, call this with
+ INSERT = 0, then call htab_clear_slot on the slot returned (possibly
+ after doing some checks). To insert an entry, call this with
+ INSERT = 1, then write the value you want into the returned slot.
+ When inserting an entry, NULL may be returned if memory allocation
+ fails. */
+
+inline static void **
+htab_find_slot (struct hashtab *htab, void *ptr, int insert,
+ int (*hash_fn)(void *), int (*eq_fn)(void *, void *))
+{
+ unsigned int index;
+ int hash, hash2;
+ size_t size;
+ void **entry;
+
+ if (htab->size * 3 <= htab->n_elements * 4
+ && htab_expand (htab, hash_fn) == 0)
+ return NULL;
+
+ hash = hash_fn (ptr);
+
+ size = htab->size;
+ index = hash % size;
+
+ entry = &htab->entries[index];
+ if (!*entry)
+ goto empty_entry;
+ else if (eq_fn (*entry, ptr))
+ return entry;
+
+ hash2 = 1 + hash % (size - 2);
+ for (;;)
+ {
+ index += hash2;
+ if (index >= size)
+ index -= size;
+
+ entry = &htab->entries[index];
+ if (!*entry)
+ goto empty_entry;
+ else if (eq_fn (*entry, ptr))
+ return entry;
+ }
+
+ empty_entry:
+ if (!insert)
+ return NULL;
+
+ htab->n_elements++;
+ return entry;
+}
+
+inline static int
+hash_tlsdesc(void *p)
+{
+ struct tlsdesc_dynamic_arg *td = p;
+
+ /* We know all entries are for the same module, so ti_offset is the
+ only distinguishing entry. */
+ return td->tlsinfo.ti_offset;
+}
+
+inline static int
+eq_tlsdesc(void *p, void *q)
+{
+ struct tlsdesc_dynamic_arg *tdp = p, *tdq = q;
+
+ return tdp->tlsinfo.ti_offset == tdq->tlsinfo.ti_offset;
+}
+
+inline static int
+map_generation (struct link_map *map)
+{
+ size_t idx = map->l_tls_modid;
+ struct dtv_slotinfo_list *listp = GL(dl_tls_dtv_slotinfo_list);
+
+ /* Find the place in the dtv slotinfo list. */
+ do
+ {
+ /* Does it fit in the array of this list element? */
+ if (idx < listp->len)
+ {
+ /* We should never get here for a module in static TLS, so
+ we can assume that, if the generation count is zero, we
+ still haven't determined the generation count for this
+ module. */
+ if (listp->slotinfo[idx].gen)
+ return listp->slotinfo[idx].gen;
+ else
+ break;
+ }
+ idx -= listp->len;
+ listp = listp->next;
+ }
+ while (listp != NULL);
+
+ /* If we get to this point, the module still hasn't been assigned an
+ entry in the dtv slotinfo data structures, and it will when we're
+ done with relocations. At that point, the module will get a
+ generation number that is one past the current generation, so
+ return exactly that. */
+ return GL(dl_tls_generation) + 1;
+}
+
+void *
+_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
+{
+ struct hashtab *ht;
+ void **entry;
+ struct tlsdesc_dynamic_arg *td, test;
+
+ /* FIXME: We could use a per-map lock here, but is it worth it? */
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+
+ ht = map->l_mach.tlsdesc_table;
+ if (! ht)
+ {
+ ht = htab_create ();
+ if (! ht)
+ {
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return 0;
+ }
+ map->l_mach.tlsdesc_table = ht;
+ }
+
+ test.tlsinfo.ti_module = map->l_tls_modid;
+ test.tlsinfo.ti_offset = ti_offset;
+ entry = htab_find_slot (ht, &test, 1, hash_tlsdesc, eq_tlsdesc);
+ if (*entry)
+ {
+ td = *entry;
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return td;
+ }
+
+ *entry = td = malloc (sizeof (struct tlsdesc_dynamic_arg));
+ /* This may be higher than the map's generation, but it doesn't
+ matter much. Worst case, we'll have one extra DTV update per
+ thread. */
+ td->gen_count = map_generation (map);
+ td->tlsinfo = test.tlsinfo;
+
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return td;
+}
+
+# endif /* SHARED */
+
+/* The idea of the following two functions is to stop multiple threads
+ from attempting to resolve the same TLS descriptor without busy
+ waiting. Ideally, we should be able to release the lock right
+ after changing td->entry, and then using say a condition variable
+ or a futex wake to wake up any waiting threads, but let's try to
+ avoid introducing such dependencies. */
+
+inline static int
+_dl_tlsdesc_resolve_early_return_p (struct tlsdesc volatile *td, void *caller)
+{
+ if (caller != td->entry)
+ return 1;
+
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+ if (caller != td->entry)
+ {
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return 1;
+ }
+
+ td->entry = _dl_tlsdesc_resolve_hold;
+
+ return 0;
+}
+
+inline static void
+_dl_tlsdesc_wake_up_held_fixups (void)
+{
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+/* The following 4 functions take an entry_check_offset argument.
+ It's computed by the caller as an offset between its entry point
+ and the call site, such that by adding the built-in return address
+ that is implicitly passed to the function with this offset, we can
+ easily obtain the caller's entry point to compare with the entry
+ point given in the TLS descriptor. If it's changed, we want to
+ return immediately. */
+
+/* These macros are copied from elf/dl-reloc.c */
+
+#define CHECK_STATIC_TLS(map, sym_map) \
+ do { \
+ if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET \
+ || ((sym_map)->l_tls_offset \
+ == FORCED_DYNAMIC_TLS_OFFSET), 0)) \
+ _dl_allocate_static_tls (sym_map); \
+ } while (0)
+
+#define TRY_STATIC_TLS(map, sym_map) \
+ (__builtin_expect ((sym_map)->l_tls_offset \
+ != FORCED_DYNAMIC_TLS_OFFSET, 1) \
+ && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1) \
+ || _dl_try_allocate_static_tls (sym_map) == 0))
+
+int internal_function _dl_try_allocate_static_tls (struct link_map *map);
+
+/* This function is used to lazily resolve TLS_DESC REL relocations
+ that reference the *ABS* segment in their own link maps. The
+ argument is the addend originally stored there. */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_abs_plus_addend_fixup (struct tlsdesc volatile *td,
+ struct link_map *l,
+ ptrdiff_t entry_check_offset)
+{
+ ptrdiff_t addend = (ptrdiff_t) td->arg;
+
+ if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+ - entry_check_offset))
+ return;
+
+#ifndef SHARED
+ CHECK_STATIC_TLS (l, l);
+#else
+ if (!TRY_STATIC_TLS (l, l))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic (l, addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+#endif
+ {
+ td->arg = (void*)(addend - l->l_tls_offset);
+ td->entry = _dl_tlsdesc_return;
+ }
+
+ _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to lazily resolve TLS_DESC REL relocations
+ that originally had zero addends. The argument location, that
+ originally held the addend, is used to hold a pointer to the
+ relocation, but it has to be restored before we call the function
+ that applies relocations. */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_rel_fixup (struct tlsdesc volatile *td,
+ struct link_map *l,
+ ptrdiff_t entry_check_offset)
+{
+ const ElfW(Rel) *reloc = td->arg;
+
+ if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+ - entry_check_offset))
+ return;
+
+ /* The code below was borrowed from _dl_fixup(),
+ except for checking for STB_LOCAL. */
+ const ElfW(Sym) *const symtab
+ = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+ const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+ const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+ lookup_t result;
+
+ /* Look up the target symbol. If the normal lookup rules are not
+ used don't look in the global scope. */
+ if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+ && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+ {
+ const struct r_found_version *version = NULL;
+
+ if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+ {
+ const ElfW(Half) *vernum =
+ (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+ ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+ version = &l->l_versions[ndx];
+ if (version->hash == 0)
+ version = NULL;
+ }
+
+ result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+ l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+ DL_LOOKUP_ADD_DEPENDENCY, NULL);
+ }
+ else
+ {
+ /* We already found the symbol. The module (and therefore its load
+ address) is also known. */
+ result = l;
+ }
+
+ if (!sym)
+ {
+ td->arg = 0;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+ {
+# ifndef SHARED
+ CHECK_STATIC_TLS (l, result);
+# else
+ if (!TRY_STATIC_TLS (l, result))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+ {
+ td->arg = (void*)(sym->st_value - result->l_tls_offset);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+
+ _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to lazily resolve TLS_DESC RELA relocations.
+ The argument location is used to hold a pointer to the relocation. */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
+ struct link_map *l,
+ ptrdiff_t entry_check_offset)
+{
+ const ElfW(Rela) *reloc = td->arg;
+
+ if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+ - entry_check_offset))
+ return;
+
+ /* The code below was borrowed from _dl_fixup(),
+ except for checking for STB_LOCAL. */
+ const ElfW(Sym) *const symtab
+ = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+ const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+ const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+ lookup_t result;
+
+ /* Look up the target symbol. If the normal lookup rules are not
+ used don't look in the global scope. */
+ if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+ && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+ {
+ const struct r_found_version *version = NULL;
+
+ if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+ {
+ const ElfW(Half) *vernum =
+ (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+ ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+ version = &l->l_versions[ndx];
+ if (version->hash == 0)
+ version = NULL;
+ }
+
+ result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+ l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+ DL_LOOKUP_ADD_DEPENDENCY, NULL);
+ }
+ else
+ {
+ /* We already found the symbol. The module (and therefore its load
+ address) is also known. */
+ result = l;
+ }
+
+ if (!sym)
+ {
+ td->arg = (void*)reloc->r_addend;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+ {
+# ifndef SHARED
+ CHECK_STATIC_TLS (l, result);
+# else
+ if (!TRY_STATIC_TLS (l, result))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+ {
+ td->arg = (void*)(sym->st_value - result->l_tls_offset
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+
+ _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
+ struct link_map *l __attribute__((__unused__)),
+ ptrdiff_t entry_check_offset)
+{
+ /* Maybe we're lucky and can return early. */
+ if (__builtin_return_address (0) - entry_check_offset != td->entry)
+ return;
+
+ /* Locking here will stop execution until the runnign resolver runs
+ _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
+
+ FIXME: We'd be better off waiting on a condition variable, such
+ that we didn't have to hold the lock throughout the relocation
+ processing. */
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+#endif /* USE_TLS */
+
+void
+_dl_unmap (struct link_map *map)
+{
+ __munmap ((void *) (map)->l_map_start,
+ (map)->l_map_end - (map)->l_map_start);
+
+#if USE_TLS && SHARED
+ /* _dl_unmap is only called for dlopen()ed libraries, for which
+ calling free() is safe, or before we've completed the initial
+ relocation, in which case calling free() is probably pointless,
+ but still safe. */
+ if (map->l_mach.tlsdesc_table)
+ htab_delete (map->l_mach.tlsdesc_table);
+#endif
+}
Index: sysdeps/i386/tlsdesc.sym
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/tlsdesc.sym 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,20 @@
+#include <stddef.h>
+#include <sysdep.h>
+#include <tls.h>
+#include <link.h>
+#include <dl-tlsdesc.h>
+
+--
+
+-- Abuse tls.h macros to derive offsets relative to the thread register.
+#if defined USE_TLS
+
+DTV_OFFSET offsetof(struct pthread, header.dtv)
+
+TLSDESC_ARG offsetof(struct tlsdesc, arg)
+
+TLSDESC_GEN_COUNT offsetof(struct tlsdesc_dynamic_arg, gen_count)
+TLSDESC_MODID offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_module)
+TLSDESC_MODOFF offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_offset)
+
+#endif
Index: sysdeps/x86_64/Makefile
===================================================================
--- sysdeps/x86_64/Makefile.orig 2006-01-13 18:14:11.000000000 -0500
+++ sysdeps/x86_64/Makefile 2006-01-13 18:16:22.000000000 -0500
@@ -9,3 +9,13 @@
ifeq ($(subdir),gmon)
sysdep_routines += _mcount
endif
+
+ifeq ($(subdir),elf)
+sysdep-dl-routines += tlsdesc dl-tlsdesc
+sysdep_routines += tlsdesc dl-tlsdesc
+sysdep-rtld-routines += tlsdesc dl-tlsdesc
+endif
+
+ifeq ($(subdir),csu)
+gen-as-const-headers += tlsdesc.sym
+endif
Index: sysdeps/x86_64/bits/linkmap.h
===================================================================
--- sysdeps/x86_64/bits/linkmap.h.orig 2006-01-13 18:14:11.000000000 -0500
+++ sysdeps/x86_64/bits/linkmap.h 2006-01-13 18:16:22.000000000 -0500
@@ -3,6 +3,7 @@
{
Elf64_Addr plt; /* Address of .plt + 0x16 */
Elf64_Addr gotplt; /* Address of .got + 0x18 */
+ void *tlsdesc_table; /* Address of TLS descriptor hash table. */
};
#else
@@ -10,5 +11,6 @@
{
Elf32_Addr plt; /* Address of .plt + 0x16 */
Elf32_Addr gotplt; /* Address of .got + 0x0c */
+ void *tlsdesc_table; /* Address of TLS descriptor hash table. */
};
#endif
Index: sysdeps/x86_64/dl-lookupcfg.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/dl-lookupcfg.h 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,28 @@
+/* Configuration of lookup functions.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
Index: sysdeps/x86_64/dl-machine.h
===================================================================
--- sysdeps/x86_64/dl-machine.h.orig 2006-01-13 18:14:11.000000000 -0500
+++ sysdeps/x86_64/dl-machine.h 2006-01-13 18:16:22.000000000 -0500
@@ -26,6 +26,7 @@
#include <sys/param.h>
#include <sysdep.h>
#include <tls.h>
+#include <dl-tlsdesc.h>
/* Return nonzero iff ELF header is compatible with the running host. */
static inline int __attribute__ ((unused))
@@ -131,6 +132,10 @@
got[2] = (Elf64_Addr) &_dl_runtime_resolve;
}
+ if (l->l_info[ADDRIDX (DT_TLSDESC_GOT)] && lazy)
+ *(Elf64_Addr*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_GOT)]) + l->l_addr)
+ = (Elf64_Addr) &_dl_tlsdesc_resolve_rela;
+
return lazy;
}
@@ -194,7 +199,9 @@
# define elf_machine_type_class(type) \
((((type) == R_X86_64_JUMP_SLOT \
|| (type) == R_X86_64_DTPMOD64 \
- || (type) == R_X86_64_DTPOFF64 || (type) == R_X86_64_TPOFF64) \
+ || (type) == R_X86_64_DTPOFF64 \
+ || (type) == R_X86_64_TPOFF64 \
+ || (type) == R_X86_64_TLSDESC) \
* ELF_RTYPE_CLASS_PLT) \
| (((type) == R_X86_64_COPY) * ELF_RTYPE_CLASS_COPY))
#else
@@ -323,6 +330,41 @@
*reloc_addr = sym->st_value + reloc->r_addend;
# endif
break;
+ case R_X86_64_TLSDESC:
+ {
+ struct tlsdesc volatile *td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+ if (! sym)
+ {
+ td->arg = (void*)reloc->r_addend;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+# endif
+ {
+# ifndef RTLD_BOOTSTRAP
+# ifndef SHARED
+ CHECK_STATIC_TLS (map, sym_map);
+# else
+ if (!TRY_STATIC_TLS (map, sym_map))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic
+ (sym_map, sym->st_value + reloc->r_addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+# endif
+ {
+ td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+ break;
+ }
case R_X86_64_TPOFF64:
/* The offset is negative, forward from the thread pointer. */
# ifndef RTLD_BOOTSTRAP
@@ -435,6 +477,15 @@
map->l_mach.plt
+ (((Elf64_Addr) reloc_addr) - map->l_mach.gotplt) * 2;
}
+ else if (__builtin_expect (r_type == R_X86_64_TLSDESC, 1))
+ {
+ struct tlsdesc volatile * __attribute__((__unused__)) td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+ td->arg = (void*)reloc;
+ td->entry = (void*)(D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
+ + map->l_addr);
+ }
else
_dl_reloc_bad_type (map, r_type, 1);
}
Index: sysdeps/x86_64/dl-tls.h
===================================================================
--- sysdeps/x86_64/dl-tls.h.orig 2006-01-13 18:14:11.000000000 -0500
+++ sysdeps/x86_64/dl-tls.h 2006-01-13 18:16:22.000000000 -0500
@@ -1,5 +1,5 @@
/* Thread-local storage handling in the ELF dynamic linker. x86-64 version.
- Copyright (C) 2002 Free Software Foundation, Inc.
+ Copyright (C) 2002, 2005 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
@@ -19,11 +19,13 @@
/* Type used for the representation of TLS information in the GOT. */
-typedef struct
+typedef struct dl_tls_index
{
unsigned long int ti_module;
unsigned long int ti_offset;
} tls_index;
+#ifdef SHARED
extern void *__tls_get_addr (tls_index *ti);
+#endif
Index: sysdeps/x86_64/dl-tlsdesc.S
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/dl-tlsdesc.S 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,203 @@
+/* Thread-local storage handling in the ELF dynamic linker. x86_64 version.
+ Copyright (C) 2004, 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#include <sysdep.h>
+#include <tls.h>
+#include "tlsdesc.h"
+
+ .text
+#ifdef USE_TLS
+ .hidden _dl_tlsdesc_return
+ .global _dl_tlsdesc_return
+ .type _dl_tlsdesc_return,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_return:
+ movq 8(%rax), %rax
+ ret
+ cfi_endproc
+ .size _dl_tlsdesc_return, .-_dl_tlsdesc_return
+
+ .hidden _dl_tlsdesc_undefweak
+ .global _dl_tlsdesc_undefweak
+ .type _dl_tlsdesc_undefweak,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_undefweak:
+ movq 8(%rax), %rax
+ subq %fs:0, %rax
+ ret
+ cfi_endproc
+ .size _dl_tlsdesc_undefweak, .-_dl_tlsdesc_undefweak
+
+#ifdef SHARED
+ .hidden _dl_tlsdesc_dynamic
+ .global _dl_tlsdesc_dynamic
+ .type _dl_tlsdesc_dynamic,@function
+
+ /* %rax points to the TLS descriptor, such that 0(%rax) points to
+ _dl_tlsdesc_dynamic itself, and 8(%rax) points to a struct
+ tlsdesc_dynamic_arg object. It must return in %rax the offset
+ between the thread pointer and the object denoted by the
+ argument, without clobbering any registers.
+
+ The assembly code that follows is a rendition of the following
+ C code, hand-optimized a little bit.
+
+ptrdiff_t
+_dl_tlsdesc_dynamic (register struct tlsdesc *tdp asm ("%rax"))
+{
+ struct tlsdesc_dynamic_arg *td = tdp->arg;
+ dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + DTV_OFFSET);
+ if (__builtin_expect (td->gen_count <= dtv[0].counter
+ && (dtv[td->tlsinfo.ti_module].pointer.val
+ != TLS_DTV_UNALLOCATED),
+ 1))
+ return dtv[td->tlsinfo.ti_module].pointer.val + td->tlsinfo.ti_offset
+ - __thread_pointer;
+
+ return __tls_get_addr_internal (&td->tlsinfo) - __thread_pointer;
+}
+*/
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_dynamic:
+ /* Preserve call-clobbered registers that we modify.
+ We need two scratch regs anyway. */
+ movq %rsi, -16(%rsp)
+ movq %fs:DTV_OFFSET, %rsi
+ movq %rdi, -8(%rsp)
+ movq TLSDESC_ARG(%rax), %rdi
+ movq (%rsi), %rax
+ cmpq %rax, TLSDESC_GEN_COUNT(%rdi)
+ ja .Lslow
+ movq TLSDESC_MODID(%rdi), %rax
+ salq $4, %rax
+ movq (%rax,%rsi), %rax
+ cmpq $-1, %rax
+ je .Lslow
+ addq TLSDESC_MODOFF(%rdi), %rax
+.Lret:
+ movq -16(%rsp), %rsi
+ subq %fs:0, %rax
+ movq -8(%rsp), %rdi
+ ret
+.Lslow:
+ /* Besides rdi and rsi, saved above, save rdx, rcx, r8, r9,
+ r10 and r11. Also, align the stack, that's off by 8 bytes. */
+ subq $72, %rsp
+ cfi_adjust_cfa_offset (72)
+ movq %rdx, 8(%rsp)
+ movq %rcx, 16(%rsp)
+ movq %r8, 24(%rsp)
+ movq %r9, 32(%rsp)
+ movq %r10, 40(%rsp)
+ movq %r11, 48(%rsp)
+ /* %rdi already points to the tlsinfo data structure. */
+ call __tls_get_addr@PLT
+ movq 8(%rsp), %rdx
+ movq 16(%rsp), %rcx
+ movq 24(%rsp), %r8
+ movq 32(%rsp), %r9
+ movq 40(%rsp), %r10
+ movq 48(%rsp), %r11
+ addq $72, %rsp
+ cfi_adjust_cfa_offset (-72)
+ jmp .Lret
+ cfi_endproc
+ .size _dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
+#endif /* SHARED */
+
+ .hidden _dl_tlsdesc_resolve_rela
+ .global _dl_tlsdesc_resolve_rela
+ .type _dl_tlsdesc_resolve_rela,@function
+ cfi_startproc
+ .align 16
+ /* The PLT entry will have pushed the link_map pointer. */
+ cfi_adjust_cfa_offset (8)
+_dl_tlsdesc_resolve_rela:
+ /* Save all call-clobbered registers. */
+ subq $72, %rsp
+ cfi_adjust_cfa_offset (72)
+ movq %rax, (%rsp)
+ movq %rdi, 8(%rsp)
+ movq %rax, %rdi /* Pass tlsdesc* in %rdi. */
+ movq %rsi, 16(%rsp)
+ movq 72(%rsp), %rsi /* Pass link_map* in %rsi. */
+ movq %r8, 24(%rsp)
+ movq %r9, 32(%rsp)
+ movq %r10, 40(%rsp)
+ movq %r11, 48(%rsp)
+ movq %rdx, 56(%rsp)
+ movq %rcx, 64(%rsp)
+ call _dl_tlsdesc_resolve_rela_fixup
+ movq (%rsp), %rax
+ movq 8(%rsp), %rdi
+ movq 16(%rsp), %rsi
+ movq 24(%rsp), %r8
+ movq 32(%rsp), %r9
+ movq 40(%rsp), %r10
+ movq 48(%rsp), %r11
+ movq 56(%rsp), %rdx
+ movq 64(%rsp), %rcx
+ addq $80, %rsp
+ cfi_adjust_cfa_offset (-80)
+ jmp *(%rax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
+
+ .hidden _dl_tlsdesc_resolve_hold
+ .global _dl_tlsdesc_resolve_hold
+ .type _dl_tlsdesc_resolve_hold,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_hold:
+0:
+ /* Save all call-clobbered registers. */
+ subq $72, %rsp
+ cfi_adjust_cfa_offset (72)
+ movq %rax, (%rsp)
+ movq %rdi, 8(%rsp)
+ movq %rax, %rdi /* Pass tlsdesc* in %rdi. */
+ movq %rsi, 16(%rsp)
+ movq $1f - 0b, %rsi /* Pass return address offset in %rsi. */
+ movq %r8, 24(%rsp)
+ movq %r9, 32(%rsp)
+ movq %r10, 40(%rsp)
+ movq %r11, 48(%rsp)
+ movq %rdx, 56(%rsp)
+ movq %rcx, 64(%rsp)
+ call _dl_tlsdesc_resolve_hold_fixup
+1:
+ movq (%rsp), %rax
+ movq 8(%rsp), %rdi
+ movq 16(%rsp), %rsi
+ movq 24(%rsp), %r8
+ movq 32(%rsp), %r9
+ movq 40(%rsp), %r10
+ movq 48(%rsp), %r11
+ movq 56(%rsp), %rdx
+ movq 64(%rsp), %rcx
+ addq $72, %rsp
+ cfi_adjust_cfa_offset (-72)
+ jmp *(%eax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
+
+#endif /* USE_TLS */
Index: sysdeps/x86_64/dl-tlsdesc.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/dl-tlsdesc.h 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,63 @@
+/* Thread-local storage descriptor handling in the ELF dynamic linker.
+ x86_64 version.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#ifndef _X86_64_DL_TLSDESC_H
+# define _X86_64_DL_TLSDESC_H 1
+
+/* Use this to access DT_TLSDESC_PLT and DT_TLSDESC_GOT. */
+#ifndef ADDRIDX
+# define ADDRIDX(tag) (DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGNUM \
+ + DT_EXTRANUM + DT_VALNUM + DT_ADDRTAGIDX (tag))
+#endif
+
+/* Type used to represent a TLS descriptor in the GOT. */
+struct tlsdesc
+{
+ ptrdiff_t (*entry)(struct tlsdesc *on_rax);
+ void *arg;
+};
+
+typedef struct dl_tls_index
+{
+ unsigned long int ti_module;
+ unsigned long int ti_offset;
+} tls_index;
+
+/* Type used as the argument in a TLS descriptor for a symbol that
+ needs dynamic TLS offsets. */
+struct tlsdesc_dynamic_arg
+{
+ tls_index tlsinfo;
+ size_t gen_count;
+};
+
+extern ptrdiff_t attribute_hidden
+ _dl_tlsdesc_return(struct tlsdesc *on_rax),
+ _dl_tlsdesc_undefweak(struct tlsdesc *on_rax),
+ _dl_tlsdesc_resolve_rela(struct tlsdesc *on_rax),
+ _dl_tlsdesc_resolve_hold(struct tlsdesc *on_rax);
+
+# ifdef SHARED
+extern void *_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset);
+
+extern ptrdiff_t attribute_hidden _dl_tlsdesc_dynamic(struct tlsdesc *);
+# endif
+
+#endif
Index: sysdeps/x86_64/tlsdesc.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/tlsdesc.c 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,556 @@
+/* Manage TLS descriptors. x86_64 version.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#include <link.h>
+#include <ldsodefs.h>
+#include <elf/dynamic-link.h>
+#include <tls.h>
+#include <dl-tlsdesc.h>
+
+#ifdef USE_TLS
+# ifdef SHARED
+
+extern void weak_function free (void *ptr);
+
+/* The hashcode handling code below is heavily inspired in libiberty's
+ hashtab code, but with most adaptation points and support for
+ deleting elements removed.
+
+ Copyright (C) 1999, 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
+ Contributed by Vladimir Makarov (vmakarov@cygnus.com). */
+
+inline static unsigned long
+higher_prime_number (unsigned long n)
+{
+ /* These are primes that are near, but slightly smaller than, a
+ power of two. */
+ static const unsigned long primes[] = {
+ (unsigned long) 7,
+ (unsigned long) 13,
+ (unsigned long) 31,
+ (unsigned long) 61,
+ (unsigned long) 127,
+ (unsigned long) 251,
+ (unsigned long) 509,
+ (unsigned long) 1021,
+ (unsigned long) 2039,
+ (unsigned long) 4093,
+ (unsigned long) 8191,
+ (unsigned long) 16381,
+ (unsigned long) 32749,
+ (unsigned long) 65521,
+ (unsigned long) 131071,
+ (unsigned long) 262139,
+ (unsigned long) 524287,
+ (unsigned long) 1048573,
+ (unsigned long) 2097143,
+ (unsigned long) 4194301,
+ (unsigned long) 8388593,
+ (unsigned long) 16777213,
+ (unsigned long) 33554393,
+ (unsigned long) 67108859,
+ (unsigned long) 134217689,
+ (unsigned long) 268435399,
+ (unsigned long) 536870909,
+ (unsigned long) 1073741789,
+ (unsigned long) 2147483647,
+ /* 4294967291L */
+ ((unsigned long) 2147483647) + ((unsigned long) 2147483644),
+ };
+
+ const unsigned long *low = &primes[0];
+ const unsigned long *high = &primes[sizeof(primes) / sizeof(primes[0])];
+
+ while (low != high)
+ {
+ const unsigned long *mid = low + (high - low) / 2;
+ if (n > *mid)
+ low = mid + 1;
+ else
+ high = mid;
+ }
+
+#if 0
+ /* If we've run out of primes, abort. */
+ if (n > *low)
+ {
+ fprintf (stderr, "Cannot find prime bigger than %lu\n", n);
+ abort ();
+ }
+#endif
+
+ return *low;
+}
+
+struct hashtab
+{
+ /* Table itself. */
+ void **entries;
+
+ /* Current size (in entries) of the hash table */
+ size_t size;
+
+ /* Current number of elements. */
+ size_t n_elements;
+};
+
+inline static struct hashtab *
+htab_create (void)
+{
+ struct hashtab *ht = malloc (sizeof (struct hashtab));
+
+ if (! ht)
+ return NULL;
+ ht->size = 3;
+ ht->entries = malloc (sizeof (void *) * ht->size);
+ if (! ht->entries)
+ return NULL;
+
+ ht->n_elements = 0;
+
+ memset (ht->entries, 0, sizeof (void *) * ht->size);
+
+ return ht;
+}
+
+/* This is only called from _dl_unmap, so it's safe to call
+ free(). See the discussion below. */
+inline static void
+htab_delete (struct hashtab *htab)
+{
+ int i;
+
+ for (i = htab->size - 1; i >= 0; i--)
+ if (htab->entries[i])
+ free (htab->entries[i]);
+
+ free (htab->entries);
+ free (htab);
+}
+
+/* Similar to htab_find_slot, but without several unwanted side effects:
+ - Does not call htab->eq_f when it finds an existing entry.
+ - Does not change the count of elements/searches/collisions in the
+ hash table.
+ This function also assumes there are no deleted entries in the table.
+ HASH is the hash value for the element to be inserted. */
+
+inline static void **
+find_empty_slot_for_expand (struct hashtab *htab, int hash)
+{
+ size_t size = htab->size;
+ unsigned int index = hash % size;
+ void **slot = htab->entries + index;
+ int hash2;
+
+ if (! *slot)
+ return slot;
+
+ hash2 = 1 + hash % (size - 2);
+ for (;;)
+ {
+ index += hash2;
+ if (index >= size)
+ index -= size;
+
+ slot = htab->entries + index;
+ if (! *slot)
+ return slot;
+ }
+}
+
+/* The following function changes size of memory allocated for the
+ entries and repeatedly inserts the table elements. The occupancy
+ of the table after the call will be about 50%. Naturally the hash
+ table must already exist. Remember also that the place of the
+ table entries is changed. If memory allocation failures are allowed,
+ this function will return zero, indicating that the table could not be
+ expanded. If all goes well, it will return a non-zero value. */
+
+inline static int
+htab_expand (struct hashtab *htab, int (*hash_fn)(void *))
+{
+ void **oentries;
+ void **olimit;
+ void **p;
+ void **nentries;
+ size_t nsize;
+
+ oentries = htab->entries;
+ olimit = oentries + htab->size;
+
+ /* Resize only when table after removal of unused elements is either
+ too full or too empty. */
+ if (htab->n_elements * 2 > htab->size)
+ nsize = higher_prime_number (htab->n_elements * 2);
+ else
+ nsize = htab->size;
+
+ nentries = malloc (sizeof (void *) * nsize);
+ memset (nentries, 0, sizeof (void *) * nsize);
+ if (nentries == NULL)
+ return 0;
+ htab->entries = nentries;
+ htab->size = nsize;
+
+ p = oentries;
+ do
+ {
+ if (*p)
+ *find_empty_slot_for_expand (htab, hash_fn (*p))
+ = *p;
+
+ p++;
+ }
+ while (p < olimit);
+
+#if 0 /* We can't tell whether this was allocated by the malloc()
+ built into ld.so or the one in the main executable or libc,
+ and calling free() for something that wasn't malloc()ed could
+ do Very Bad Things (TM). Take the conservative approach
+ here, potentially wasting as much memory as actually used by
+ the hash table, even if multiple growths occur. That's not
+ so bad as to require some overengineered solution that would
+ enable us to keep track of how it was allocated. */
+ free (oentries);
+#endif
+ return 1;
+}
+
+/* This function searches for a hash table slot containing an entry
+ equal to the given element. To delete an entry, call this with
+ INSERT = 0, then call htab_clear_slot on the slot returned (possibly
+ after doing some checks). To insert an entry, call this with
+ INSERT = 1, then write the value you want into the returned slot.
+ When inserting an entry, NULL may be returned if memory allocation
+ fails. */
+
+inline static void **
+htab_find_slot (struct hashtab *htab, void *ptr, int insert,
+ int (*hash_fn)(void *), int (*eq_fn)(void *, void *))
+{
+ unsigned int index;
+ int hash, hash2;
+ size_t size;
+ void **entry;
+
+ if (htab->size * 3 <= htab->n_elements * 4
+ && htab_expand (htab, hash_fn) == 0)
+ return NULL;
+
+ hash = hash_fn (ptr);
+
+ size = htab->size;
+ index = hash % size;
+
+ entry = &htab->entries[index];
+ if (!*entry)
+ goto empty_entry;
+ else if (eq_fn (*entry, ptr))
+ return entry;
+
+ hash2 = 1 + hash % (size - 2);
+ for (;;)
+ {
+ index += hash2;
+ if (index >= size)
+ index -= size;
+
+ entry = &htab->entries[index];
+ if (!*entry)
+ goto empty_entry;
+ else if (eq_fn (*entry, ptr))
+ return entry;
+ }
+
+ empty_entry:
+ if (!insert)
+ return NULL;
+
+ htab->n_elements++;
+ return entry;
+}
+
+inline static int
+hash_tlsdesc(void *p)
+{
+ struct tlsdesc_dynamic_arg *td = p;
+
+ /* We know all entries are for the same module, so ti_offset is the
+ only distinguishing entry. */
+ return td->tlsinfo.ti_offset;
+}
+
+inline static int
+eq_tlsdesc(void *p, void *q)
+{
+ struct tlsdesc_dynamic_arg *tdp = p, *tdq = q;
+
+ return tdp->tlsinfo.ti_offset == tdq->tlsinfo.ti_offset;
+}
+
+inline static int
+map_generation (struct link_map *map)
+{
+ size_t idx = map->l_tls_modid;
+ struct dtv_slotinfo_list *listp = GL(dl_tls_dtv_slotinfo_list);
+
+ /* Find the place in the dtv slotinfo list. */
+ do
+ {
+ /* Does it fit in the array of this list element? */
+ if (idx < listp->len)
+ {
+ /* We should never get here for a module in static TLS, so
+ we can assume that, if the generation count is zero, we
+ still haven't determined the generation count for this
+ module. */
+ if (listp->slotinfo[idx].gen)
+ return listp->slotinfo[idx].gen;
+ else
+ break;
+ }
+ idx -= listp->len;
+ listp = listp->next;
+ }
+ while (listp != NULL);
+
+ /* If we get to this point, the module still hasn't been assigned an
+ entry in the dtv slotinfo data structures, and it will when we're
+ done with relocations. At that point, the module will get a
+ generation number that is one past the current generation, so
+ return exactly that. */
+ return GL(dl_tls_generation) + 1;
+}
+
+void *
+_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
+{
+ struct hashtab *ht;
+ void **entry;
+ struct tlsdesc_dynamic_arg *td, test;
+
+ /* FIXME: We could use a per-map lock here, but is it worth it? */
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+
+ ht = map->l_mach.tlsdesc_table;
+ if (! ht)
+ {
+ ht = htab_create ();
+ if (! ht)
+ {
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return 0;
+ }
+ map->l_mach.tlsdesc_table = ht;
+ }
+
+ test.tlsinfo.ti_module = map->l_tls_modid;
+ test.tlsinfo.ti_offset = ti_offset;
+ entry = htab_find_slot (ht, &test, 1, hash_tlsdesc, eq_tlsdesc);
+ if (*entry)
+ {
+ td = *entry;
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return td;
+ }
+
+ *entry = td = malloc (sizeof (struct tlsdesc_dynamic_arg));
+ /* This may be higher than the map's generation, but it doesn't
+ matter much. Worst case, we'll have one extra DTV update per
+ thread. */
+ td->gen_count = map_generation (map);
+ td->tlsinfo = test.tlsinfo;
+
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return td;
+}
+
+# endif /* SHARED */
+
+/* The idea of the following two functions is to stop multiple threads
+ from attempting to resolve the same TLS descriptor without busy
+ waiting. Ideally, we should be able to release the lock right
+ after changing td->entry, and then using say a condition variable
+ or a futex wake to wake up any waiting threads, but let's try to
+ avoid introducing such dependencies. */
+
+inline static int
+_dl_tlsdesc_resolve_early_return_p (struct tlsdesc volatile *td, void *caller)
+{
+ if (caller != td->entry)
+ return 1;
+
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+ if (caller != td->entry)
+ {
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return 1;
+ }
+
+ td->entry = _dl_tlsdesc_resolve_hold;
+
+ return 0;
+}
+
+inline static void
+_dl_tlsdesc_wake_up_held_fixups (void)
+{
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+/* The following 2 functions take an entry_check_offset argument.
+ It's computed by the caller as an offset between its entry point
+ and the call site, such that by adding the built-in return address
+ that is implicitly passed to the function with this offset, we can
+ easily obtain the caller's entry point to compare with the entry
+ point given in the TLS descriptor. If it's changed, we want to
+ return immediately. */
+
+/* These macros are copied from elf/dl-reloc.c */
+
+#define CHECK_STATIC_TLS(map, sym_map) \
+ do { \
+ if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET \
+ || ((sym_map)->l_tls_offset \
+ == FORCED_DYNAMIC_TLS_OFFSET), 0)) \
+ _dl_allocate_static_tls (sym_map); \
+ } while (0)
+
+#define TRY_STATIC_TLS(map, sym_map) \
+ (__builtin_expect ((sym_map)->l_tls_offset \
+ != FORCED_DYNAMIC_TLS_OFFSET, 1) \
+ && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1) \
+ || _dl_try_allocate_static_tls (sym_map) == 0))
+
+int internal_function _dl_try_allocate_static_tls (struct link_map *map);
+
+/* This function is used to lazily resolve TLS_DESC RELA relocations.
+ The argument location is used to hold a pointer to the relocation. */
+
+void
+attribute_hidden
+_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
+ struct link_map *l)
+{
+ const ElfW(Rela) *reloc = td->arg;
+
+ if (_dl_tlsdesc_resolve_early_return_p
+ (td, (void*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_PLT)]) + l->l_addr)))
+ return;
+
+ /* The code below was borrowed from _dl_fixup(). */
+ const ElfW(Sym) *const symtab
+ = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+ const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+ const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+ lookup_t result;
+
+ /* Look up the target symbol. If the normal lookup rules are not
+ used don't look in the global scope. */
+ if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+ && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+ {
+ const struct r_found_version *version = NULL;
+
+ if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+ {
+ const ElfW(Half) *vernum =
+ (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+ ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+ version = &l->l_versions[ndx];
+ if (version->hash == 0)
+ version = NULL;
+ }
+
+ result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+ l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+ DL_LOOKUP_ADD_DEPENDENCY, NULL);
+ }
+ else
+ {
+ /* We already found the symbol. The module (and therefore its load
+ address) is also known. */
+ result = l;
+ }
+
+ if (! sym)
+ {
+ td->arg = (void*)reloc->r_addend;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+ {
+# ifndef SHARED
+ CHECK_STATIC_TLS (l, result);
+# else
+ if (!TRY_STATIC_TLS (l, result))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+ {
+ td->arg = (void*)(sym->st_value - result->l_tls_offset
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+
+ _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+void
+attribute_hidden
+_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
+ ptrdiff_t entry_check_offset)
+{
+ /* Maybe we're lucky and can return early. */
+ if (__builtin_return_address (0) - entry_check_offset != td->entry)
+ return;
+
+ /* Locking here will stop execution until the runnign resolver runs
+ _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
+
+ FIXME: We'd be better off waiting on a condition variable, such
+ that we didn't have to hold the lock throughout the relocation
+ processing. */
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+#endif /* USE_TLS */
+
+void
+_dl_unmap (struct link_map *map)
+{
+ __munmap ((void *) (map)->l_map_start,
+ (map)->l_map_end - (map)->l_map_start);
+
+#if USE_TLS && SHARED
+ /* _dl_unmap is only called for dlopen()ed libraries, for which
+ calling free() is safe, or before we've completed the initial
+ relocation, in which case calling free() is probably pointless,
+ but still safe. */
+ if (map->l_mach.tlsdesc_table)
+ htab_delete (map->l_mach.tlsdesc_table);
+#endif
+}
Index: sysdeps/x86_64/tlsdesc.sym
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/tlsdesc.sym 2006-01-13 18:16:22.000000000 -0500
@@ -0,0 +1,20 @@
+#include <stddef.h>
+#include <sysdep.h>
+#include <tls.h>
+#include <link.h>
+#include <dl-tlsdesc.h>
+
+--
+
+-- Abuse tls.h macros to derive offsets relative to the thread register.
+#if defined USE_TLS
+
+DTV_OFFSET offsetof(struct pthread, header.dtv)
+
+TLSDESC_ARG offsetof(struct tlsdesc, arg)
+
+TLSDESC_GEN_COUNT offsetof(struct tlsdesc_dynamic_arg, gen_count)
+TLSDESC_MODID offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_module)
+TLSDESC_MODOFF offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_offset)
+
+#endif
--
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
Secretary for FSF Latin America http://www.fsfla.org/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}