This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: TLS improvements for IA32 and AMD64/EM64T


On Sep 16, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:

> Over the past few months, I've been working on porting to IA32 and
> AMD64/EM64T the interesting bits of the TLS design I came up with for
> FR-V, achieving some impressive speedups along with slight code size
> reductions in the most common cases.

> Although the design is not set in stone yet, it's fully implemented
> and functional with patches I'm about to post for binutils, gcc and
> glibc mainline, as follow-ups to this message, except that the GCC
> patch will go to gcc-patches, as expected.

This is the glibc portion of the implementation.  I've built it for
amd64-linux-gnu and i686-pc-linux-gnu, with both the current TLS
dialect and the new -mtls-dialect=gnu2, without regressions.

The patch also adds a test to fix a problem I ran into while testing
these optimizations, that turned out to be possible to trigger with
the old code.

It also introduces one missing initial-exec annotation to a data
symbol, removing code size differences between the dynamic loaders.
In fact, after my latest build, I found out the size comparisons in
the specs document I've just posted were incorrect.  With the latest
build, from scratch, all libraries had the same sizes, except for
libpthread.so, libmemusage.so and TLS test libraries in elf/ and
nptl/, that were *always* smaller with the new dialect on both
architectures.  Yay!

Here's the patch.  There are no dependencies on the new code in
binutils and GCC.

? ports
Index: ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* elf/dl-conflict.c (TRY_STATIC_TLS): Dummy define.
	* elf/dl-reloc.c (_dl_try_allocate_static_tls): Extract from...
	(_dl_allocate_static_tls): ... here.  Rearrange failure path.
	(TRY_STATIC_TLS): New macro.
	* elf/dynamic-link.h (elf_get_dynamic_info): Adjust DT_TLSDESC_GOT
	and DT_TLSDESC_PLT.
	* elf/elf.h (DT_TLSDESC_GOT, DT_TLSDESC_PLT): Define.
	(R_386_TLS_GOTDESC, R_386_TLS_DESC, R_386_TLS_DESC_CALL): Define.
	(R_X86_64_PC64, R_X86_GOTOFF64, R_X86_64_GOTPC32): Merge from
	binutils.
	(R_X86_64_GOTPC32_TLSDESC, R_X86_64_TLSDESC,
	R_X86_64_TLSDESC_CALL): Define.
	(R_386_NUM, R_X86_64_NUM): Adjust.
	* sysdeps/i386/Makefile (sysdep-dl-routines, sysdep_routines,
	systep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
	(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
	* sysdeps/i386/dl-lookupcfg.h: New file.  Introduce _dl_unmap to
	release tlsdesc_table.
	* sysdeps/i386/dl-machine.h: Include dl-tlsdesc.h.
	(elf_machine_type_class): Mark R_386_TLS_DESC as PLT class.
	(elf_machine_rel): Handle R_386_TLS_DESC.
	(elf_machine_rela): Likewise.
	(elf_machine_lazy_rel): Likewise.
	(elf_machine_lazy_rela): Likewise.
	* sysdeps/i386/dl-tls.h (struct dl_tls_index): Name it.
	* sysdeps/i386/dl-tlsdesc.S: New file.
	* sysdeps/i386/dl-tlsdesc.h: New file.
	* sysdeps/i386/tlsdesc.c: New file.
	* sysdeps/i386/tlsdesc.sym: New file.
	* sysdeps/i386/bits/linkmap.h (struct link_map_machine): Add
	tlsdesc_table.
	* sysdeps/x86_64/Makefile (sysdep-dl-routines, sysdep_routines,
	systep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
	(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
	* sysdeps/x86_64/dl-lookupcfg.h: New file.  Introduce _dl_unmap to
	release tlsdesc_table.
	* sysdeps/x86_64/dl-machine.h: Include dl-tlsdesc.h.
	(elf_machine_runtime_setup): Set up lazy TLSDESC GOT entry.
	(elf_machine_type_class): Mark R_X86_64_TLSDESC as PLT class.
	(ELF_MACHINE_DT_TLSDESC): Define.
	(elf_machine_rel): Handle R_X86_64_TLSDESC.
	(elf_machine_rela): Likewise.
	(elf_machine_lazy_rel): Likewise.
	* sysdeps/x86_64/dl-tls.h (struct dl_tls_index): Name it.
	(__tls_get_addr): Do not declare for non-shared compiles.
	* sysdeps/x86_64/dl-tlsdesc.S: New file.
	* sysdeps/x86_64/dl-tlsdesc.h: New file.
	* sysdeps/x86_64/tlsdesc.c: New file.
	* sysdeps/x86_64/tlsdesc.sym: New file.
	* sysdeps/x86_64/bits/linkmap.h (struct link_map_machine): Add
	tlsdesc_table for both 32- and 64-bit structs.

	* include/link.h (FORCED_DYNAMIC_TLS_OFFSET): Define.
	* sysdeps/generic/dl-tls.c (_dl_allocate_tls_init): Check for it.
	* elf/dl-close.c (_dl_close): Likewise.
	* elf/dl-reloc.c (CHECK_STATIC_TLS): Likewise.
	(_dl_allocate_static_tls): Likewise.
	* sysdeps/generic/dl-tls.c (__tls_get_addr): Protect from race
	conditions in setting l_tls_offset to it.
	* elf/tst-tls16.c: New test.
	* elf/tst-tlsmod16a.c, elf/tst-tlsmod16b.c: Supporting libraries.
	* elf/Makefile: Set it up.

	* elf/dl-tsd.c (__libc_dl_error_tsd): Access static TLS with
	initial exec.

Index: elf/Makefile
===================================================================
RCS file: /cvs/glibc/libc/elf/Makefile,v
retrieving revision 1.303
diff -u -p -r1.303 Makefile
--- elf/Makefile 7 Jul 2005 22:59:14 -0000 1.303
+++ elf/Makefile 16 Sep 2005 04:23:55 -0000
@@ -162,8 +162,9 @@ tests += loadtest restest1 preloadtest l
 	 neededtest3 neededtest4 unload2 lateglobal initfirst global \
 	 restest2 next dblload dblunload reldep5 reldep6 reldep7 reldep8 \
 	 circleload1 tst-tls3 tst-tls4 tst-tls5 tst-tls6 tst-tls7 tst-tls8 \
-	 tst-tls10 tst-tls11 tst-tls12 tst-tls13 tst-tls14 tst-tls15 tst-align \
-	 tst-align2 $(tests-execstack-$(have-z-execstack)) tst-dlmodcount \
+	 tst-tls10 tst-tls11 tst-tls12 tst-tls13 tst-tls14 tst-tls15 \
+	 tst-tls16 tst-align tst-align2 \
+	 $(tests-execstack-$(have-z-execstack)) tst-dlmodcount \
 	 tst-dlopenrpath tst-deep1 tst-dlmopen1 tst-dlmopen2 tst-dlmopen3 \
 	 unload3 unload4 unload5 unload6 tst-audit1 tst-global1 order2 \
 	 tst-stackguard1
@@ -194,7 +195,7 @@ modules-names = testobj1 testobj2 testob
 		tst-tlsmod5 tst-tlsmod6 tst-tlsmod7 tst-tlsmod8 \
 		tst-tlsmod9 tst-tlsmod10 tst-tlsmod11 tst-tlsmod12 \
 		tst-tlsmod13 tst-tlsmod13a tst-tlsmod14a tst-tlsmod14b \
-		tst-tlsmod15a tst-tlsmod15b \
+		tst-tlsmod15a tst-tlsmod15b tst-tlsmod16a tst-tlsmod16b \
 		circlemod1 circlemod1a circlemod2 circlemod2a \
 		circlemod3 circlemod3a \
 		reldep8mod1 reldep8mod2 reldep8mod3 \
@@ -481,6 +482,7 @@ tst-tlsmod12.so-no-z-defs = yes
 tst-tlsmod14a.so-no-z-defs = yes
 tst-tlsmod14b.so-no-z-defs = yes
 tst-tlsmod15a.so-no-z-defs = yes
+tst-tlsmod16b.so-no-z-defs = yes
 circlemod2.so-no-z-defs = yes
 circlemod3.so-no-z-defs = yes
 circlemod3a.so-no-z-defs = yes
@@ -699,6 +701,9 @@ $(objpfx)tst-tls14.out: $(objpfx)tst-tls
 $(objpfx)tst-tls15: $(libdl)
 $(objpfx)tst-tls15.out: $(objpfx)tst-tlsmod15a.so $(objpfx)tst-tlsmod15b.so
 
+$(objpfx)tst-tls16: $(libdl)
+$(objpfx)tst-tls16.out: $(objpfx)tst-tlsmod16a.so $(objpfx)tst-tlsmod16b.so
+
 CFLAGS-tst-align.c = $(stack-align-test-flags)
 CFLAGS-tst-align2.c = $(stack-align-test-flags)
 CFLAGS-tst-alignmod.c = $(stack-align-test-flags)
Index: elf/dl-close.c
===================================================================
RCS file: /cvs/glibc/libc/elf/dl-close.c,v
retrieving revision 1.116
diff -u -p -r1.116 dl-close.c
--- elf/dl-close.c 27 Apr 2005 01:32:01 -0000 1.116
+++ elf/dl-close.c 16 Sep 2005 04:23:57 -0000
@@ -436,7 +436,8 @@ _dl_close (void *_map)
 		/* All dynamically loaded modules with TLS are unloaded.  */
 		GL(dl_tls_max_dtv_idx) = GL(dl_tls_static_nelem);
 
-	      if (imap->l_tls_offset != NO_TLS_OFFSET)
+	      if (imap->l_tls_offset != NO_TLS_OFFSET
+		  && imap->l_tls_offset != FORCED_DYNAMIC_TLS_OFFSET)
 		{
 		  /* Collect a contiguous chunk built from the objects in
 		     this search list, going in either direction.  When the
Index: elf/dl-conflict.c
===================================================================
RCS file: /cvs/glibc/libc/elf/dl-conflict.c,v
retrieving revision 1.12
diff -u -p -r1.12 dl-conflict.c
--- elf/dl-conflict.c 14 Oct 2004 01:53:55 -0000 1.12
+++ elf/dl-conflict.c 16 Sep 2005 04:23:57 -0000
@@ -1,5 +1,5 @@
 /* Resolve conflicts against already prelinked libraries.
-   Copyright (C) 2001, 2002, 2003, 2004 Free Software Foundation, Inc.
+   Copyright (C) 2001, 2002, 2003, 2004, 2005 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Jakub Jelinek <jakub@redhat.com>, 2001.
 
@@ -45,6 +45,7 @@ _dl_resolve_conflicts (struct link_map *
 #define RESOLVE_MAP(ref, version, flags) (*ref = NULL, NULL)
 #define RESOLVE(ref, version, flags) (*ref = NULL, 0)
 #define CHECK_STATIC_TLS(ref_map, sym_map) ((void) 0)
+#define TRY_STATIC_TLS(ref_map, sym_map) (0)
 #define RESOLVE_CONFLICT_FIND_MAP(map, r_offset) \
   do {									      \
     while ((resolve_conflict_map->l_map_end < (ElfW(Addr)) (r_offset))	      \
Index: elf/dl-reloc.c
===================================================================
RCS file: /cvs/glibc/libc/elf/dl-reloc.c,v
retrieving revision 1.101
diff -u -p -r1.101 dl-reloc.c
--- elf/dl-reloc.c 7 Jul 2005 02:24:54 -0000 1.101
+++ elf/dl-reloc.c 16 Sep 2005 04:23:58 -0000
@@ -1,5 +1,5 @@
 /* Relocate a shared object and resolve its references to other loaded objects.
-   Copyright (C) 1995-2004, 2005 Free Software Foundation, Inc.
+   Copyright (C) 1995-2005 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -44,16 +44,17 @@
    This function intentionally does not return any value but signals error
    directly, as static TLS should be rare and code handling it should
    not be inlined as much as possible.  */
-void
-internal_function __attribute_noinline__
-_dl_allocate_static_tls (struct link_map *map)
+int
+internal_function
+_dl_try_allocate_static_tls (struct link_map *map)
 {
-  /* If the alignment requirements are too high fail.  */
-  if (map->l_tls_align > GL(dl_tls_static_align))
+  /* If we've already used the variable with dynamic access, or if the
+     alignment requirements are too high, fail.  */
+  if (map->l_tls_offset == FORCED_DYNAMIC_TLS_OFFSET
+      || map->l_tls_align > GL(dl_tls_static_align))
     {
     fail:
-      _dl_signal_error (0, map->l_name, NULL, N_("\
-cannot allocate memory in static TLS block"));
+      return -1;
     }
 
 # if TLS_TCB_AT_TP
@@ -107,6 +108,20 @@ cannot allocate memory in static TLS blo
     }
   else
     map->l_need_tls_init = 1;
+
+  return 0;
+}
+
+void
+internal_function __attribute_noinline__
+_dl_allocate_static_tls (struct link_map *map)
+{
+  if (map->l_tls_offset == FORCED_DYNAMIC_TLS_OFFSET
+      || _dl_try_allocate_static_tls (map))
+    {
+      _dl_signal_error (0, map->l_name, NULL, N_("\
+cannot allocate memory in static TLS block"));
+    }
 }
 
 /* Initialize static TLS area and DTV for current (only) thread.
@@ -257,12 +272,20 @@ _dl_relocate_object (struct link_map *l,
        an attempt to allocate it in surplus space on the fly.  If that
        can't be done, we fall back to the error that DF_STATIC_TLS is
        intended to produce.  */
-#define CHECK_STATIC_TLS(map, sym_map)					      \
-    do {								      \
-      if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET, 0))     \
-	_dl_allocate_static_tls (sym_map);				      \
+#define CHECK_STATIC_TLS(map, sym_map)					\
+    do {								\
+      if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET	\
+			    || ((sym_map)->l_tls_offset			\
+				== FORCED_DYNAMIC_TLS_OFFSET), 0))	\
+	_dl_allocate_static_tls (sym_map);				\
     } while (0)
 
+#define TRY_STATIC_TLS(map, sym_map)					\
+    (__builtin_expect ((sym_map)->l_tls_offset				\
+		       != FORCED_DYNAMIC_TLS_OFFSET, 1)			\
+     && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1)	\
+	 || _dl_try_allocate_static_tls (sym_map) == 0))
+
 #include "dynamic-link.h"
 
     ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling);
Index: elf/dl-tsd.c
===================================================================
RCS file: /cvs/glibc/libc/elf/dl-tsd.c,v
retrieving revision 1.3
diff -u -p -r1.3 dl-tsd.c
--- elf/dl-tsd.c 4 Dec 2002 12:27:51 -0000 1.3
+++ elf/dl-tsd.c 16 Sep 2005 04:23:58 -0000
@@ -1,5 +1,5 @@
 /* Thread-local data used by error handling for runtime dynamic linker.
-   Copyright (C) 2002 Free Software Foundation, Inc.
+   Copyright (C) 2002, 2005 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -49,7 +49,7 @@ void **(*_dl_error_catch_tsd) (void) __a
 void ** __attribute__ ((const))
 __libc_dl_error_tsd (void)
 {
-  static __thread void *data;
+  static __thread void *data attribute_tls_model_ie;
   return &data;
 }
 
Index: elf/dynamic-link.h
===================================================================
RCS file: /cvs/glibc/libc/elf/dynamic-link.h,v
retrieving revision 1.54
diff -u -p -r1.54 dynamic-link.h
--- elf/dynamic-link.h 15 Mar 2005 22:57:25 -0000 1.54
+++ elf/dynamic-link.h 16 Sep 2005 04:23:59 -0000
@@ -141,6 +141,10 @@ elf_get_dynamic_info (struct link_map *l
 # if ! ELF_MACHINE_NO_REL
       ADJUST_DYN_INFO (DT_REL);
 # endif
+# if ELF_MACHINE_DT_TLSDESC
+      ADJUST_DYN_INFO (VERSYMIDX (DT_TLSDESC_GOT));
+      ADJUST_DYN_INFO (VERSYMIDX (DT_TLSDESC_PLT));
+# endif
       ADJUST_DYN_INFO (DT_JMPREL);
       ADJUST_DYN_INFO (VERSYMIDX (DT_VERSYM));
 # undef ADJUST_DYN_INFO
Index: elf/elf.h
===================================================================
RCS file: /cvs/glibc/libc/elf/elf.h,v
retrieving revision 1.149
diff -u -p -r1.149 elf.h
--- elf/elf.h 7 Aug 2005 07:53:58 -0000 1.149
+++ elf/elf.h 16 Sep 2005 04:24:07 -0000
@@ -715,6 +715,13 @@ typedef struct
    GNU extension.  */
 #define DT_VERSYM	0x6ffffff0
 
+#define DT_TLSDESC_GOT	0x6ffffff7	/* Location of GOT entry used
+					   by TLS descriptor resolver
+					   PLT entry.  */
+#define DT_TLSDESC_PLT	0x6ffffff8	/* Location of PLT entry for
+					   TLS descriptor resolver
+					   calls.  */
+
 #define DT_RELACOUNT	0x6ffffff9
 #define DT_RELCOUNT	0x6ffffffa
 
@@ -1136,8 +1143,16 @@ typedef struct
 #define R_386_TLS_DTPMOD32 35		/* ID of module containing symbol */
 #define R_386_TLS_DTPOFF32 36		/* Offset in TLS block */
 #define R_386_TLS_TPOFF32  37		/* Negated offset in static TLS block */
+#define R_386_TLS_GOTDESC  38		/* GOT offset for TLS descriptor.  */
+#define R_386_TLS_DESC     39		/* TLS descriptor containing
+					   pointer to code and to
+					   argument, returning the TLS
+					   offset for the symbol.  */
+#define R_386_TLS_DESC_CALL 40		/* Marker of call through TLS
+					   descriptor for
+					   relaxation.  */
 /* Keep this the last entry.  */
-#define R_386_NUM	   38
+#define R_386_NUM	   41
 
 /* SUN SPARC specific definitions.  */
 
@@ -2496,8 +2511,16 @@ typedef Elf32_Addr Elf32_Conflict;
 #define R_X86_64_GOTTPOFF	22	/* 32 bit signed PC relative offset
 					   to GOT entry for IE symbol */
 #define R_X86_64_TPOFF32	23	/* Offset in initial TLS block */
+#define R_X86_64_PC64		24	/* PC relative 64 bit */
+#define R_X86_64_GOTOFF64	25	/* 64 bit offset to GOT */
+#define R_X86_64_GOTPC32	26	/* 32 bit signed pc relative
+					   offset to GOT */
+#define R_X86_64_GOTPC32_TLSDESC 27	/* GOT offset for TLS descriptor.  */
+#define R_X86_64_TLSDESC        28	/* TLS descriptor.  */
+#define R_X86_64_TLSDESC_CALL   29	/* Marker for call through TLS
+					   descriptor.  */
 
-#define R_X86_64_NUM		24
+#define R_X86_64_NUM		30
 
 
 /* AM33 relocations.  */
Index: elf/tst-tls16.c
===================================================================
RCS file: elf/tst-tls16.c
diff -N elf/tst-tls16.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ elf/tst-tls16.c 16 Sep 2005 04:24:08 -0000
@@ -0,0 +1,52 @@
+#include <dlfcn.h>
+#include <stdio.h>
+
+static int
+do_test (void)
+{
+  void *h = dlopen ("tst-tlsmod16a.so", RTLD_LAZY | RTLD_GLOBAL);
+  if (h == NULL)
+    {
+      puts ("unexpectedly failed to open tst-tlsmod16a.so");
+      exit (1);
+    }
+
+  void *p = dlsym (h, "tlsvar");
+
+  /* This dlopen should indeed fail, because tlsvar was assigned to
+     dynamic TLS, and the new module requests it to be in static TLS.
+     However, there's a possibility that dlopen succeeds if the
+     variable is, for whatever reason, assigned to static TLS, or if
+     the module fails to require static TLS, or even if TLS is not
+     supported.  */
+  h = dlopen ("tst-tlsmod16b.so", RTLD_NOW | RTLD_GLOBAL);
+  if (h == NULL)
+    {
+      return 0;
+    }
+
+  puts ("unexpectedly succeeded to open tst-tlsmod16b.so");
+
+
+  void *(*fp) (void) = (void *(*) (void)) dlsym (h, "in_dso");
+  if (fp == NULL)
+    {
+      puts ("cannot find in_dso");
+      exit (1);
+    }
+
+  /* If the dlopen passes, at least make sure the address returned by
+     dlsym is the same as that returned by the initial-exec access.
+     If the variable was assigned to dynamic TLS during dlsym, this
+     portion will fail.  */
+  if (fp () != p)
+    {
+      puts ("returned values do not match");
+      exit (1);
+    }
+
+  return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"
Index: elf/tst-tlsmod16a.c
===================================================================
RCS file: elf/tst-tlsmod16a.c
diff -N elf/tst-tlsmod16a.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ elf/tst-tlsmod16a.c 16 Sep 2005 04:24:08 -0000
@@ -0,0 +1,8 @@
+#include <tls.h>
+
+#if defined USE_TLS && defined HAVE___THREAD \
+    && defined HAVE_TLS_MODEL_ATTRIBUTE
+int __thread tlsvar;
+#else
+int tlsvar;
+#endif
Index: elf/tst-tlsmod16b.c
===================================================================
RCS file: elf/tst-tlsmod16b.c
diff -N elf/tst-tlsmod16b.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ elf/tst-tlsmod16b.c 16 Sep 2005 04:24:08 -0000
@@ -0,0 +1,14 @@
+#include <tls.h>
+
+#if defined USE_TLS && defined HAVE___THREAD \
+    && defined HAVE_TLS_MODEL_ATTRIBUTE
+extern __thread int tlsvar __attribute__((tls_model("initial-exec")));
+#else
+extern int tlsvar;
+#endif
+
+void *
+in_dso (void)
+{
+  return &tlsvar;
+}
Index: include/link.h
===================================================================
RCS file: /cvs/glibc/libc/include/link.h,v
retrieving revision 1.37
diff -u -p -r1.37 link.h
--- include/link.h 18 Mar 2005 10:56:02 -0000 1.37
+++ include/link.h 16 Sep 2005 04:24:12 -0000
@@ -296,6 +296,15 @@ struct link_map
 # ifndef NO_TLS_OFFSET
 #  define NO_TLS_OFFSET	0
 # endif
+# ifndef FORCED_DYNAMIC_TLS_OFFSET
+#  if NO_TLS_OFFSET == 0
+#   define FORCED_DYNAMIC_TLS_OFFSET 1
+#  elif NO_TLS_OFFSET == -1
+#   define FORCED_DYNAMIC_TLS_OFFSET -2
+#  else
+#   error "FORCED_DYNAMIC_TLS_OFFSET is not defined"
+#  endif
+# endif
     /* For objects present at startup time: offset in the static TLS block.  */
     ptrdiff_t l_tls_offset;
     /* Index of the module in the dtv array.  */
Index: sysdeps/generic/dl-tls.c
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/generic/dl-tls.c,v
retrieving revision 1.50
diff -u -p -r1.50 dl-tls.c
--- sysdeps/generic/dl-tls.c 20 Mar 2005 22:23:25 -0000 1.50
+++ sysdeps/generic/dl-tls.c 16 Sep 2005 04:24:32 -0000
@@ -417,7 +417,8 @@ _dl_allocate_tls_init (void *result)
 	     not be the generation counter.  */
 	  maxgen = MAX (maxgen, listp->slotinfo[cnt].gen);
 
-	  if (map->l_tls_offset == NO_TLS_OFFSET)
+	  if (map->l_tls_offset == NO_TLS_OFFSET
+	      || map->l_tls_offset == FORCED_DYNAMIC_TLS_OFFSET)
 	    {
 	      /* For dynamically loaded modules we simply store
 		 the value indicating deferred allocation.  */
@@ -706,6 +707,7 @@ __tls_get_addr (GET_ADDR_ARGS)
   if (__builtin_expect (dtv[0].counter != GL(dl_tls_generation), 0))
     the_map = _dl_update_slotinfo (GET_ADDR_MODULE);
 
+ retry:
   p = dtv[GET_ADDR_MODULE].pointer.val;
 
   if (__builtin_expect (p == TLS_DTV_UNALLOCATED, 0))
@@ -726,6 +728,28 @@ __tls_get_addr (GET_ADDR_ARGS)
 	  the_map = listp->slotinfo[idx].map;
 	}
 
+      /* Make sure that, if a dlopen running in parallel forces the
+	 variable into static storage, we'll wait until the address in
+	 the static TLS block is set up, and use that.  If we're
+	 undecided yet, make sure we make the decision holding the
+	 lock as well.  */
+      if (__builtin_expect (the_map->l_tls_offset
+			    != FORCED_DYNAMIC_TLS_OFFSET, 0))
+	{
+	  __rtld_lock_lock_recursive (GL(dl_load_lock));
+	  if (__builtin_expect (the_map->l_tls_offset == NO_TLS_OFFSET, 1))
+	    {
+	      the_map->l_tls_offset = FORCED_DYNAMIC_TLS_OFFSET;
+	      __rtld_lock_unlock_recursive (GL(dl_load_lock));
+	    }
+	  else
+	    {
+	      __rtld_lock_unlock_recursive (GL(dl_load_lock));
+	      if (__builtin_expect (the_map->l_tls_offset
+				    != FORCED_DYNAMIC_TLS_OFFSET, 1))
+		goto retry;
+	    }
+	}
       p = dtv[GET_ADDR_MODULE].pointer.val = allocate_and_init (the_map);
       dtv[GET_ADDR_MODULE].pointer.is_static = false;
     }
Index: sysdeps/i386/Makefile
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/i386/Makefile,v
retrieving revision 1.20
diff -u -p -r1.20 Makefile
--- sysdeps/i386/Makefile 6 Mar 2005 00:18:16 -0000 1.20
+++ sysdeps/i386/Makefile 16 Sep 2005 04:24:32 -0000
@@ -65,3 +65,13 @@ endif
 ifneq (,$(filter -mno-tls-direct-seg-refs,$(CFLAGS)))
 defines += -DNO_TLS_DIRECT_SEG_REFS
 endif
+
+ifeq ($(subdir),elf)
+sysdep-dl-routines += tlsdesc dl-tlsdesc
+sysdep_routines += tlsdesc dl-tlsdesc
+sysdep-rtld-routines += tlsdesc dl-tlsdesc
+endif
+
+ifeq ($(subdir),csu)
+gen-as-const-headers += tlsdesc.sym
+endif
Index: sysdeps/i386/dl-lookupcfg.h
===================================================================
RCS file: sysdeps/i386/dl-lookupcfg.h
diff -N sysdeps/i386/dl-lookupcfg.h
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/i386/dl-lookupcfg.h 16 Sep 2005 04:24:32 -0000
@@ -0,0 +1,28 @@
+/* Configuration of lookup functions.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
Index: sysdeps/i386/dl-machine.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/i386/dl-machine.h,v
retrieving revision 1.134
diff -u -p -r1.134 dl-machine.h
--- sysdeps/i386/dl-machine.h 6 Mar 2005 00:08:34 -0000 1.134
+++ sysdeps/i386/dl-machine.h 16 Sep 2005 04:24:34 -0000
@@ -25,6 +25,7 @@
 #include <sys/param.h>
 #include <sysdep.h>
 #include <tls.h>
+#include <dl-tlsdesc.h>
 
 /* Return nonzero iff ELF header is compatible with the running host.  */
 static inline int __attribute__ ((unused))
@@ -248,7 +249,7 @@ _dl_start_user:\n\
 # define elf_machine_type_class(type) \
   ((((type) == R_386_JMP_SLOT || (type) == R_386_TLS_DTPMOD32		      \
      || (type) == R_386_TLS_DTPOFF32 || (type) == R_386_TLS_TPOFF32	      \
-     || (type) == R_386_TLS_TPOFF)					      \
+     || (type) == R_386_TLS_TPOFF || (type) == R_386_TLS_DESC)		      \
     * ELF_RTYPE_CLASS_PLT)						      \
    | (((type) == R_386_COPY) * ELF_RTYPE_CLASS_COPY))
 #else
@@ -375,6 +376,34 @@ elf_machine_rel (struct link_map *map, c
 	    *reloc_addr = sym->st_value;
 # endif
 	  break;
+	case R_386_TLS_DESC:
+# ifndef RTLD_BOOTSTRAP
+	  if (sym != NULL)
+# endif
+	    {
+	      struct tlsdesc volatile *td =
+		(struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+#  ifndef SHARED
+	      CHECK_STATIC_TLS (map, sym_map);
+#  else
+	      if (!TRY_STATIC_TLS (map, sym_map))
+		{
+		  td->arg = _dl_make_tlsdesc_dynamic
+		    (sym_map, sym->st_value + (ElfW(Word))td->arg);
+		  td->entry = _dl_tlsdesc_dynamic;
+		}
+	      else
+#  endif
+# endif
+		{
+		  td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+				    + (ElfW(Word))td->arg);
+		  td->entry = _dl_tlsdesc_return;
+		}
+	    }
+	  break;
 	case R_386_TLS_TPOFF32:
 	  /* The offset is positive, backward from the thread pointer.  */
 # ifdef RTLD_BOOTSTRAP
@@ -488,6 +517,34 @@ elf_machine_rela (struct link_map *map, 
 	     Therefore the offset is already correct.  */
 	  *reloc_addr = (sym == NULL ? 0 : sym->st_value) + reloc->r_addend;
 	  break;
+	case R_386_TLS_DESC:
+# ifndef RTLD_BOOTSTRAP
+	  if (sym != NULL)
+# endif
+	    {
+	      struct tlsdesc volatile *td =
+		(struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+#  ifndef SHARED
+	      CHECK_STATIC_TLS (map, sym_map);
+#  else
+	      if (!TRY_STATIC_TLS (map, sym_map))
+		{
+		  td->arg = _dl_make_tlsdesc_dynamic
+		    (sym_map, sym->st_value + reloc->r_addend);
+		  td->entry = _dl_tlsdesc_dynamic;
+		}
+	      else
+#  endif
+# endif
+		{
+		  td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+				    + reloc->r_addend);
+		  td->entry = _dl_tlsdesc_return;
+		}
+	    }
+	  break;
 	case R_386_TLS_TPOFF32:
 	  /* The offset is positive, backward from the thread pointer.  */
 	  /* We know the offset of object the symbol is contained in.
@@ -582,6 +639,55 @@ elf_machine_lazy_rel (struct link_map *m
 	*reloc_addr = (map->l_mach.plt
 		       + (((Elf32_Addr) reloc_addr) - map->l_mach.gotplt) * 4);
     }
+#ifdef USE_TLS
+  else if (__builtin_expect (r_type == R_386_TLS_DESC, 1))
+    {
+      struct tlsdesc volatile * __attribute__((__unused__)) td =
+	(struct tlsdesc volatile *)reloc_addr;
+
+      /* Handle relocations that reference the local *ABS* in a simple
+	 way, so as to preserve a potential addend.  */
+      if (ELF32_R_SYM (reloc->r_info) == 0)
+	td->entry = _dl_tlsdesc_resolve_abs_plus_addend;
+      /* Given a known-zero addend, we can store a pointer to the
+	 reloc in the arg position.  */
+      else if (td->arg == 0)
+	{
+	  td->arg = (void*)reloc;
+	  td->entry = _dl_tlsdesc_resolve_rel;
+	}
+      else
+	{
+	  /* We could handle non-*ABS* relocations with non-zero addends
+	     by allocating dynamically an arg to hold a pointer to the
+	     reloc, but that sounds pointless.  */
+	  const Elf32_Rel *const r = reloc;
+	  /* The code below was borrowed from elf_dynamic_do_rel().  */
+	  const ElfW(Sym) *const symtab =
+	    (const void *) D_PTR (map, l_info[DT_SYMTAB]);
+
+#ifdef RTLD_BOOTSTRAP
+	  /* The dynamic linker always uses versioning.  */
+	  assert (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL);
+#else
+	  if (map->l_info[VERSYMIDX (DT_VERSYM)])
+#endif
+	    {
+	      const ElfW(Half) *const version =
+		(const void *) D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+	      ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
+	      elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)],
+			       &map->l_versions[ndx],
+			       (void *) (l_addr + r->r_offset));
+	    }
+#ifndef RTLD_BOOTSTRAP
+	  else
+	    elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)], NULL,
+			     (void *) (l_addr + r->r_offset));
+#endif
+	}
+    }
+#endif
   else
     _dl_reloc_bad_type (map, r_type, 1);
 }
@@ -593,6 +699,22 @@ __attribute__ ((always_inline))
 elf_machine_lazy_rela (struct link_map *map,
 		       Elf32_Addr l_addr, const Elf32_Rela *reloc)
 {
+#ifdef USE_TLS
+  Elf32_Addr *const reloc_addr = (void *) (l_addr + reloc->r_offset);
+  const unsigned int r_type = ELF32_R_TYPE (reloc->r_info);
+  if (__builtin_expect (r_type == R_386_JMP_SLOT, 1))
+    ;
+  else if (__builtin_expect (r_type == R_386_TLS_DESC, 1))
+    {
+      struct tlsdesc volatile * __attribute__((__unused__)) td =
+	(struct tlsdesc volatile *)reloc_addr;
+
+      td->arg = (void*)reloc;
+      td->entry = _dl_tlsdesc_resolve_rela;
+    }
+  else
+    _dl_reloc_bad_type (map, r_type, 1);
+#endif
 }
 
 #endif	/* !RTLD_BOOTSTRAP */
Index: sysdeps/i386/dl-tls.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/i386/dl-tls.h,v
retrieving revision 1.8
diff -u -p -r1.8 dl-tls.h
--- sysdeps/i386/dl-tls.h 6 Mar 2004 08:12:40 -0000 1.8
+++ sysdeps/i386/dl-tls.h 16 Sep 2005 04:24:35 -0000
@@ -19,7 +19,7 @@
 
 
 /* Type used for the representation of TLS information in the GOT.  */
-typedef struct
+typedef struct dl_tls_index
 {
   unsigned long int ti_module;
   unsigned long int ti_offset;
Index: sysdeps/i386/dl-tlsdesc.S
===================================================================
RCS file: sysdeps/i386/dl-tlsdesc.S
diff -N sysdeps/i386/dl-tlsdesc.S
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/i386/dl-tlsdesc.S 16 Sep 2005 04:24:35 -0000
@@ -0,0 +1,216 @@
+/* Thread-local storage handling in the ELF dynamic linker.  i386 version.
+   Copyright (C) 2004, 2005 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <sysdep.h>
+#include <tls.h>
+#include "tlsdesc.h"
+
+	.text
+#ifdef USE_TLS
+	.hidden _dl_tlsdesc_return
+	.global	_dl_tlsdesc_return
+	.type	_dl_tlsdesc_return,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_return:
+	movl	4(%eax), %eax
+	ret
+	cfi_endproc
+	.size	_dl_tlsdesc_return, .-_dl_tlsdesc_return
+
+#ifdef SHARED
+	.hidden _dl_tlsdesc_dynamic
+	.global	_dl_tlsdesc_dynamic
+	.type	_dl_tlsdesc_dynamic,@function
+	
+     /* %eax points to the TLS descriptor, such that 0(%eax) points to
+	_dl_tlsdesc_dynamic itself, and 4(%eax) points to a struct
+	tlsdesc_dynamic_arg object.  It must return in %eax the offset
+	between the thread pointer and the object denoted by the
+	argument, without clobbering any registers.
+
+	The assembly code that follows is a rendition of the following
+	C code, hand-optimized a little bit.
+	
+ptrdiff_t
+__attribute__ ((__regparm__ (1)))
+_dl_tlsdesc_dynamic (struct tlsdesc *tdp) 
+{
+  struct tlsdesc_dynamic_arg *td = tdp->arg; 
+  dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + DTV_OFFSET);
+  if (__builtin_expect (td->gen_count <= dtv[0].counter
+			&& (dtv[td->tlsinfo.ti_module].pointer.val
+			    != TLS_DTV_UNALLOCATED),
+			1))
+    return dtv[td->tlsinfo.ti_module].pointer.val + td->tlsinfo.ti_offset 
+      - __thread_pointer;
+
+  return ___tls_get_addr (&td->tlsinfo) - __thread_pointer;
+}
+*/
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_dynamic:
+	/* Preserve call-clobbered registers.  
+	   We need two scratch regs anyway.
+	   FIXME: maybe remove the requirement to preserve them?  */
+	subl	$28, %esp
+	cfi_adjust_cfa_offset (28)
+	movl	%ecx, 20(%esp)
+	movl	%edx, 24(%esp)
+	movl	TLSDESC_ARG(%eax), %eax
+	movl	%gs:DTV_OFFSET, %edx
+	movl	TLSDESC_GEN_COUNT(%eax), %ecx
+	cmpl	(%edx), %ecx
+	ja	.Lslow
+	movl	TLSDESC_MODID(%eax), %ecx
+	movl	(%edx,%ecx,8), %edx
+	cmpl	$-1, %edx
+	je	.Lslow
+	movl	TLSDESC_MODOFF(%eax), %eax
+	addl	%edx, %eax
+.Lret:
+	movl	20(%esp), %ecx
+	subl	%gs:0, %eax
+	movl	24(%esp), %edx
+	addl	$28, %esp
+	cfi_adjust_cfa_offset (-28)
+	ret
+	.p2align 4,,7
+.Lslow:
+	cfi_adjust_cfa_offset (28)
+	movl	%ebx, 16(%esp)
+	call	__i686.get_pc_thunk.bx
+	addl	$_GLOBAL_OFFSET_TABLE_, %ebx
+	call	___tls_get_addr@PLT
+	movl	16(%esp), %ebx
+	jmp	.Lret
+	cfi_endproc
+	.size	_dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
+#endif /* SHARED */
+	
+	.hidden _dl_tlsdesc_resolve_abs_plus_addend
+	.global	_dl_tlsdesc_resolve_abs_plus_addend
+	.type	_dl_tlsdesc_resolve_abs_plus_addend,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_abs_plus_addend:
+0:
+	pushl	%eax
+	cfi_adjust_cfa_offset (4)
+	pushl	%ecx
+	cfi_adjust_cfa_offset (4)
+	pushl	%edx
+	cfi_adjust_cfa_offset (4)
+	movl	$1f - 0b, %ecx
+	movl	4(%ebx), %edx
+	call	_dl_tlsdesc_resolve_abs_plus_addend_fixup
+1:
+	popl	%edx
+	cfi_adjust_cfa_offset (-4)
+	popl	%ecx
+	cfi_adjust_cfa_offset (-4)
+	popl	%eax
+	cfi_adjust_cfa_offset (-4)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_abs_plus_addend, .-_dl_tlsdesc_resolve_abs_plus_addend
+	
+	.hidden _dl_tlsdesc_resolve_rel
+	.global	_dl_tlsdesc_resolve_rel
+	.type	_dl_tlsdesc_resolve_rel,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_rel:
+0:
+	pushl	%eax
+	cfi_adjust_cfa_offset (4)
+	pushl	%ecx
+	cfi_adjust_cfa_offset (4)
+	pushl	%edx
+	cfi_adjust_cfa_offset (4)
+	movl	$1f - 0b, %ecx
+	movl	4(%ebx), %edx
+	call	_dl_tlsdesc_resolve_rel_fixup
+1:
+	popl	%edx
+	cfi_adjust_cfa_offset (-4)
+	popl	%ecx
+	cfi_adjust_cfa_offset (-4)
+	popl	%eax
+	cfi_adjust_cfa_offset (-4)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_rel, .-_dl_tlsdesc_resolve_rel
+
+	.hidden _dl_tlsdesc_resolve_rela
+	.global	_dl_tlsdesc_resolve_rela
+	.type	_dl_tlsdesc_resolve_rela,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_rela:
+0:
+	pushl	%eax
+	cfi_adjust_cfa_offset (4)
+	pushl	%ecx
+	cfi_adjust_cfa_offset (4)
+	pushl	%edx
+	cfi_adjust_cfa_offset (4)
+	movl	$1f - 0b, %ecx
+	movl	4(%ebx), %edx
+	call	_dl_tlsdesc_resolve_rela_fixup
+1:
+	popl	%edx
+	cfi_adjust_cfa_offset (-4)
+	popl	%ecx
+	cfi_adjust_cfa_offset (-4)
+	popl	%eax
+	cfi_adjust_cfa_offset (-4)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
+
+	.hidden _dl_tlsdesc_resolve_hold
+	.global	_dl_tlsdesc_resolve_hold
+	.type	_dl_tlsdesc_resolve_hold,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_hold:
+0:
+	pushl	%eax
+	cfi_adjust_cfa_offset (4)
+	pushl	%ecx
+	cfi_adjust_cfa_offset (4)
+	pushl	%edx
+	cfi_adjust_cfa_offset (4)
+	movl	$1f - 0b, %ecx
+	movl	4(%ebx), %edx
+	call	_dl_tlsdesc_resolve_hold_fixup
+1:
+	popl	%edx
+	cfi_adjust_cfa_offset (-4)
+	popl	%ecx
+	cfi_adjust_cfa_offset (-4)
+	popl	%eax
+	cfi_adjust_cfa_offset (-4)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
+	
+#endif /* USE_TLS */
Index: sysdeps/i386/dl-tlsdesc.h
===================================================================
RCS file: sysdeps/i386/dl-tlsdesc.h
diff -N sysdeps/i386/dl-tlsdesc.h
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/i386/dl-tlsdesc.h 16 Sep 2005 04:24:35 -0000
@@ -0,0 +1,59 @@
+/* Thread-local storage descriptor handling in the ELF dynamic linker.
+   i386 version.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _I386_DL_TLSDESC_H
+# define _I386_DL_TLSDESC_H 1
+
+/* Type used to represent a TLS descriptor in the GOT.  */
+struct tlsdesc
+{
+  ptrdiff_t __attribute__((regparm(1))) (*entry)(struct tlsdesc *);
+  void *arg;
+};
+
+typedef struct dl_tls_index
+{
+  unsigned long int ti_module;
+  unsigned long int ti_offset;
+} tls_index;
+
+/* Type used as the argument in a TLS descriptor for a symbol that
+   needs dynamic TLS offsets.  */
+struct tlsdesc_dynamic_arg
+{
+  tls_index tlsinfo;
+  size_t gen_count;
+};
+
+extern ptrdiff_t attribute_hidden __attribute__((regparm(1)))
+  _dl_tlsdesc_return(struct tlsdesc *),
+  _dl_tlsdesc_resolve_abs_plus_addend(struct tlsdesc *),
+  _dl_tlsdesc_resolve_rel(struct tlsdesc *),
+  _dl_tlsdesc_resolve_rela(struct tlsdesc *),
+  _dl_tlsdesc_resolve_hold(struct tlsdesc *);
+
+# ifdef SHARED
+extern void *_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset);
+
+extern ptrdiff_t attribute_hidden __attribute__((regparm(1)))
+  _dl_tlsdesc_dynamic(struct tlsdesc *);
+# endif
+
+#endif
Index: sysdeps/i386/tlsdesc.c
===================================================================
RCS file: sysdeps/i386/tlsdesc.c
diff -N sysdeps/i386/tlsdesc.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/i386/tlsdesc.c 16 Sep 2005 04:24:37 -0000
@@ -0,0 +1,657 @@
+/* Manage TLS descriptors.  i386 version.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <link.h>
+#include <ldsodefs.h>
+#include <elf/dynamic-link.h>
+#include <tls.h>
+#include <dl-tlsdesc.h>
+
+#ifdef USE_TLS
+# ifdef SHARED
+
+extern void weak_function free (void *ptr);
+
+/* The hashcode handling code below is heavily inspired in libiberty's
+   hashtab code, but with most adaptation points and support for
+   deleting elements removed.
+
+   Copyright (C) 1999, 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
+   Contributed by Vladimir Makarov (vmakarov@cygnus.com).  */
+
+inline static unsigned long
+higher_prime_number (unsigned long n)
+{
+  /* These are primes that are near, but slightly smaller than, a
+     power of two.  */
+  static const unsigned long primes[] = {
+    (unsigned long) 7,
+    (unsigned long) 13,
+    (unsigned long) 31,
+    (unsigned long) 61,
+    (unsigned long) 127,
+    (unsigned long) 251,
+    (unsigned long) 509,
+    (unsigned long) 1021,
+    (unsigned long) 2039,
+    (unsigned long) 4093,
+    (unsigned long) 8191,
+    (unsigned long) 16381,
+    (unsigned long) 32749,
+    (unsigned long) 65521,
+    (unsigned long) 131071,
+    (unsigned long) 262139,
+    (unsigned long) 524287,
+    (unsigned long) 1048573,
+    (unsigned long) 2097143,
+    (unsigned long) 4194301,
+    (unsigned long) 8388593,
+    (unsigned long) 16777213,
+    (unsigned long) 33554393,
+    (unsigned long) 67108859,
+    (unsigned long) 134217689,
+    (unsigned long) 268435399,
+    (unsigned long) 536870909,
+    (unsigned long) 1073741789,
+    (unsigned long) 2147483647,
+					/* 4294967291L */
+    ((unsigned long) 2147483647) + ((unsigned long) 2147483644),
+  };
+
+  const unsigned long *low = &primes[0];
+  const unsigned long *high = &primes[sizeof(primes) / sizeof(primes[0])];
+
+  while (low != high)
+    {
+      const unsigned long *mid = low + (high - low) / 2;
+      if (n > *mid)
+	low = mid + 1;
+      else
+	high = mid;
+    }
+
+#if 0
+  /* If we've run out of primes, abort.  */
+  if (n > *low)
+    {
+      fprintf (stderr, "Cannot find prime bigger than %lu\n", n);
+      abort ();
+    }
+#endif
+
+  return *low;
+}
+
+struct hashtab
+{
+  /* Table itself.  */
+  void **entries;
+
+  /* Current size (in entries) of the hash table */
+  size_t size;
+
+  /* Current number of elements.  */
+  size_t n_elements;
+};  
+
+inline static struct hashtab *
+htab_create (void)
+{
+  struct hashtab *ht = malloc (sizeof (struct hashtab));
+
+  if (! ht)
+    return NULL;
+  ht->size = 3;
+  ht->entries = malloc (sizeof (void *) * ht->size);
+  if (! ht->entries)
+    return NULL;
+  
+  ht->n_elements = 0;
+
+  memset (ht->entries, 0, sizeof (void *) * ht->size);
+  
+  return ht;
+}
+
+/* This is only called from _dl_unmap, so it's safe to call
+   free().  See the discussion below.  */
+inline static void
+htab_delete (struct hashtab *htab)
+{
+  int i;
+
+  for (i = htab->size - 1; i >= 0; i--)
+    if (htab->entries[i])
+      free (htab->entries[i]);
+
+  free (htab->entries);
+  free (htab);
+}
+
+/* Similar to htab_find_slot, but without several unwanted side effects:
+    - Does not call htab->eq_f when it finds an existing entry.
+    - Does not change the count of elements/searches/collisions in the
+      hash table.
+   This function also assumes there are no deleted entries in the table.
+   HASH is the hash value for the element to be inserted.  */
+
+inline static void **
+find_empty_slot_for_expand (struct hashtab *htab, int hash)
+{
+  size_t size = htab->size;
+  unsigned int index = hash % size;
+  void **slot = htab->entries + index;
+  int hash2;
+
+  if (! *slot)
+    return slot;
+
+  hash2 = 1 + hash % (size - 2);
+  for (;;)
+    {
+      index += hash2;
+      if (index >= size)
+	index -= size;
+
+      slot = htab->entries + index;
+      if (! *slot)
+	return slot;
+    }
+}
+
+/* The following function changes size of memory allocated for the
+   entries and repeatedly inserts the table elements.  The occupancy
+   of the table after the call will be about 50%.  Naturally the hash
+   table must already exist.  Remember also that the place of the
+   table entries is changed.  If memory allocation failures are allowed,
+   this function will return zero, indicating that the table could not be
+   expanded.  If all goes well, it will return a non-zero value.  */
+
+inline static int
+htab_expand (struct hashtab *htab, int (*hash_fn)(void *))
+{
+  void **oentries;
+  void **olimit;
+  void **p;
+  void **nentries;
+  size_t nsize;
+
+  oentries = htab->entries;
+  olimit = oentries + htab->size;
+
+  /* Resize only when table after removal of unused elements is either
+     too full or too empty.  */
+  if (htab->n_elements * 2 > htab->size)
+    nsize = higher_prime_number (htab->n_elements * 2);
+  else
+    nsize = htab->size;
+
+  nentries = malloc (sizeof (void *) * nsize);
+  memset (nentries, 0, sizeof (void *) * nsize);
+  if (nentries == NULL)
+    return 0;
+  htab->entries = nentries;
+  htab->size = nsize;
+
+  p = oentries;
+  do
+    {
+      if (*p)
+	*find_empty_slot_for_expand (htab, hash_fn (*p))
+	  = *p;
+
+      p++;
+    }
+  while (p < olimit);
+
+#if 0 /* We can't tell whether this was allocated by the malloc()
+	 built into ld.so or the one in the main executable or libc,
+	 and calling free() for something that wasn't malloc()ed could
+	 do Very Bad Things (TM).  Take the conservative approach
+	 here, potentially wasting as much memory as actually used by
+	 the hash table, even if multiple growths occur.  That's not
+	 so bad as to require some overengineered solution that would
+	 enable us to keep track of how it was allocated. */
+  free (oentries);
+#endif
+  return 1;
+}
+
+/* This function searches for a hash table slot containing an entry
+   equal to the given element.  To delete an entry, call this with
+   INSERT = 0, then call htab_clear_slot on the slot returned (possibly
+   after doing some checks).  To insert an entry, call this with
+   INSERT = 1, then write the value you want into the returned slot.
+   When inserting an entry, NULL may be returned if memory allocation
+   fails.  */
+
+inline static void **
+htab_find_slot (struct hashtab *htab, void *ptr, int insert,
+		int (*hash_fn)(void *), int (*eq_fn)(void *, void *))
+{
+  unsigned int index;
+  int hash, hash2;
+  size_t size;
+  void **entry;
+
+  if (htab->size * 3 <= htab->n_elements * 4
+      && htab_expand (htab, hash_fn) == 0)
+    return NULL;
+
+  hash = hash_fn (ptr);
+
+  size = htab->size;
+  index = hash % size;
+
+  entry = &htab->entries[index];
+  if (!*entry)
+    goto empty_entry;
+  else if (eq_fn (*entry, ptr))
+    return entry;
+      
+  hash2 = 1 + hash % (size - 2);
+  for (;;)
+    {
+      index += hash2;
+      if (index >= size)
+	index -= size;
+      
+      entry = &htab->entries[index];
+      if (!*entry)
+	goto empty_entry;
+      else if (eq_fn (*entry, ptr))
+	return entry;
+    }
+
+ empty_entry:
+  if (!insert)
+    return NULL;
+
+  htab->n_elements++;
+  return entry;
+}
+
+inline static int
+hash_tlsdesc(void *p)
+{
+  struct tlsdesc_dynamic_arg *td = p;
+
+  /* We know all entries are for the same module, so ti_offset is the
+     only distinguishing entry.  */
+  return td->tlsinfo.ti_offset;
+}
+
+inline static int
+eq_tlsdesc(void *p, void *q)
+{
+  struct tlsdesc_dynamic_arg *tdp = p, *tdq = q;
+
+  return tdp->tlsinfo.ti_offset == tdq->tlsinfo.ti_offset;
+}
+
+inline static int
+map_generation (struct link_map *map)
+{
+  size_t idx = map->l_tls_modid;
+  struct dtv_slotinfo_list *listp = GL(dl_tls_dtv_slotinfo_list);
+
+  /* Find the place in the dtv slotinfo list.  */
+  do
+    {
+      /* Does it fit in the array of this list element?  */
+      if (idx < listp->len)
+	{
+	  /* We should never get here for a module in static TLS, so
+	     we can assume that, if the generation count is zero, we
+	     still haven't determined the generation count for this
+	     module.  */
+	  if (listp->slotinfo[idx].gen)
+	    return listp->slotinfo[idx].gen;
+	  else
+	    break;
+	}
+      idx -= listp->len;
+      listp = listp->next;
+    }
+  while (listp != NULL);
+  
+  /* If we get to this point, the module still hasn't been assigned an
+     entry in the dtv slotinfo data structures, and it will when we're
+     done with relocations.  At that point, the module will get a
+     generation number that is one past the current generation, so
+     return exactly that.  */
+  return GL(dl_tls_generation) + 1;
+}
+
+void *
+_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
+{
+  struct hashtab *ht;
+  void **entry;
+  struct tlsdesc_dynamic_arg *td, test;
+
+  /* FIXME: We could use a per-map lock here, but is it worth it?  */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+
+  ht = map->l_mach.tlsdesc_table;
+  if (! ht)
+    {
+      ht = htab_create ();
+      if (! ht)
+	{
+	  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+	  return 0;
+	}
+      map->l_mach.tlsdesc_table = ht;
+    }
+
+  test.tlsinfo.ti_module = map->l_tls_modid;
+  test.tlsinfo.ti_offset = ti_offset;
+  entry = htab_find_slot (ht, &test, 1, hash_tlsdesc, eq_tlsdesc);
+  if (*entry)
+    {
+      td = *entry;
+      __rtld_lock_unlock_recursive (GL(dl_load_lock));
+      return td;
+    }
+
+  *entry = td = malloc (sizeof (struct tlsdesc_dynamic_arg));
+  /* This may be higher than the map's generation, but it doesn't
+     matter much.  Worst case, we'll have one extra DTV update per
+     thread.  */
+  td->gen_count = map_generation (map);
+  td->tlsinfo = test.tlsinfo;
+
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+  return td;
+}
+
+# endif /* SHARED */
+
+/* The idea of the following two functions is to stop multiple threads
+   from attempting to resolve the same TLS descriptor without busy
+   waiting.  Ideally, we should be able to release the lock right
+   after changing td->entry, and then using say a condition variable
+   or a futex wake to wake up any waiting threads, but let's try to
+   avoid introducing such dependencies.  */
+
+inline static int
+_dl_tlsdesc_resolve_early_return_p (struct tlsdesc volatile *td, void *caller)
+{
+  if (caller != td->entry)
+    return 1;
+
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+  if (caller != td->entry)
+    {
+      __rtld_lock_unlock_recursive (GL(dl_load_lock));
+      return 1;
+    }
+
+  td->entry = _dl_tlsdesc_resolve_hold;
+  
+  return 0;
+}
+
+inline static void
+_dl_tlsdesc_wake_up_held_fixups (void)
+{
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+/* The following 4 functions take an entry_check_offset argument.
+   It's computed by the caller as an offset between its entry point
+   and the call site, such that by adding the built-in return address
+   that is implicitly passed to the function with this offset, we can
+   easily obtain the caller's entry point to compare with the entry
+   point given in the TLS descriptor.  If it's changed, we want to
+   return immediately.  */
+
+/* These macros are copied from elf/dl-reloc.c */
+
+#define CHECK_STATIC_TLS(map, sym_map)					\
+    do {								\
+      if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET	\
+			    || ((sym_map)->l_tls_offset			\
+				== FORCED_DYNAMIC_TLS_OFFSET), 0))	\
+	_dl_allocate_static_tls (sym_map);				\
+    } while (0)
+
+#define TRY_STATIC_TLS(map, sym_map)					\
+    (__builtin_expect ((sym_map)->l_tls_offset				\
+		       != FORCED_DYNAMIC_TLS_OFFSET, 1)			\
+     && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1)	\
+	 || _dl_try_allocate_static_tls (sym_map) == 0))
+
+int internal_function _dl_try_allocate_static_tls (struct link_map *map);
+
+/* This function is used to lazily resolve TLS_DESC REL relocations
+   that reference the *ABS* segment in their own link maps.  The
+   argument is the addend originally stored there.  */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_abs_plus_addend_fixup (struct tlsdesc volatile *td,
+					   struct link_map *l,
+					   ptrdiff_t entry_check_offset)
+{
+  ptrdiff_t addend = (ptrdiff_t) td->arg;
+
+  if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+					  - entry_check_offset))
+    return;
+
+#ifndef SHARED
+  CHECK_STATIC_TLS (l, l);
+#else
+  if (!TRY_STATIC_TLS (l, l))
+    {
+      td->arg = _dl_make_tlsdesc_dynamic (l, addend);
+      td->entry = _dl_tlsdesc_dynamic;
+    }
+  else
+#endif
+    {
+      td->arg = (void*)(addend - l->l_tls_offset);
+      td->entry = _dl_tlsdesc_return;
+    }
+
+  _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to lazily resolve TLS_DESC REL relocations
+   that originally had zero addends.  The argument location, that
+   originally held the addend, is used to hold a pointer to the
+   relocation, but it has to be restored before we call the function
+   that applies relocations.  */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_rel_fixup (struct tlsdesc volatile *td,
+			       struct link_map *l,
+			       ptrdiff_t entry_check_offset)
+{
+  const ElfW(Rel) *reloc = td->arg;
+
+  if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+					  - entry_check_offset))
+    return;
+
+  /* The code below was borrowed from _dl_fixup(),
+     except for checking for STB_LOCAL.  */
+  const ElfW(Sym) *const symtab
+    = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+  const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+  const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+  lookup_t result;
+
+   /* Look up the target symbol.  If the normal lookup rules are not
+      used don't look in the global scope.  */
+  if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+      && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+    {
+      const struct r_found_version *version = NULL;
+
+      if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+	{
+	  const ElfW(Half) *vernum =
+	    (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+	  ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+	  version = &l->l_versions[ndx];
+	  if (version->hash == 0)
+	    version = NULL;
+	}
+
+      result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+				    l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+				    DL_LOOKUP_ADD_DEPENDENCY, NULL);
+    }
+  else
+    {
+      /* We already found the symbol.  The module (and therefore its load
+	 address) is also known.  */
+      result = l;
+    }
+
+#  ifndef SHARED
+  CHECK_STATIC_TLS (l, result);
+#  else
+  if (!TRY_STATIC_TLS (l, result))
+    {
+      td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value);
+      td->entry = _dl_tlsdesc_dynamic;
+    }
+  else
+#  endif
+    {
+      td->arg = (void*)(sym->st_value - result->l_tls_offset);
+      td->entry = _dl_tlsdesc_return;
+    }
+
+  _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to lazily resolve TLS_DESC RELA relocations.
+   The argument location is used to hold a pointer to the relocation.  */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
+				struct link_map *l,
+				ptrdiff_t entry_check_offset)
+{
+  const ElfW(Rela) *reloc = td->arg;
+
+  if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+					  - entry_check_offset))
+    return;
+
+  /* The code below was borrowed from _dl_fixup(),
+     except for checking for STB_LOCAL.  */
+  const ElfW(Sym) *const symtab
+    = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+  const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+  const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+  lookup_t result;
+
+   /* Look up the target symbol.  If the normal lookup rules are not
+      used don't look in the global scope.  */
+  if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+      && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+    {
+      const struct r_found_version *version = NULL;
+
+      if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+	{
+	  const ElfW(Half) *vernum =
+	    (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+	  ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+	  version = &l->l_versions[ndx];
+	  if (version->hash == 0)
+	    version = NULL;
+	}
+
+      result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+				    l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+				    DL_LOOKUP_ADD_DEPENDENCY, NULL);
+    }
+  else
+    {
+      /* We already found the symbol.  The module (and therefore its load
+	 address) is also known.  */
+      result = l;
+    }
+
+#  ifndef SHARED
+  CHECK_STATIC_TLS (l, result);
+#  else
+  if (!TRY_STATIC_TLS (l, result))
+    {
+      td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
+					  + reloc->r_addend);
+      td->entry = _dl_tlsdesc_dynamic;
+    }
+  else
+#  endif
+    {
+      td->arg = (void*)(sym->st_value - result->l_tls_offset
+			+ reloc->r_addend);
+      td->entry = _dl_tlsdesc_return;
+    }
+
+  _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
+				struct link_map *l __attribute__((__unused__)),
+				ptrdiff_t entry_check_offset)
+{
+  /* Maybe we're lucky and can return early.  */
+  if (__builtin_return_address (0) - entry_check_offset != td->entry)
+    return;
+
+  /* Locking here will stop execution until the runnign resolver runs
+     _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
+
+     FIXME: We'd be better off waiting on a condition variable, such
+     that we didn't have to hold the lock throughout the relocation
+     processing.  */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+#endif /* USE_TLS */
+
+void
+_dl_unmap (struct link_map *map)
+{
+  __munmap ((void *) (map)->l_map_start,
+	    (map)->l_map_end - (map)->l_map_start);
+
+#if USE_TLS && SHARED
+  /* _dl_unmap is only called for dlopen()ed libraries, for which
+     calling free() is safe, or before we've completed the initial
+     relocation, in which case calling free() is probably pointless,
+     but still safe.  */
+  if (map->l_mach.tlsdesc_table)
+    htab_delete (map->l_mach.tlsdesc_table);
+#endif
+}
Index: sysdeps/i386/tlsdesc.sym
===================================================================
RCS file: sysdeps/i386/tlsdesc.sym
diff -N sysdeps/i386/tlsdesc.sym
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/i386/tlsdesc.sym 16 Sep 2005 04:24:37 -0000
@@ -0,0 +1,20 @@
+#include <stddef.h>
+#include <sysdep.h>
+#include <tls.h>
+#include <link.h>
+#include <dl-tlsdesc.h>
+
+--
+
+-- Abuse tls.h macros to derive offsets relative to the thread register.
+#if defined USE_TLS
+
+DTV_OFFSET			offsetof(struct pthread, header.dtv)
+
+TLSDESC_ARG			offsetof(struct tlsdesc, arg)
+
+TLSDESC_GEN_COUNT		offsetof(struct tlsdesc_dynamic_arg, gen_count)
+TLSDESC_MODID			offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_module)
+TLSDESC_MODOFF			offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_offset)
+
+#endif
Index: sysdeps/i386/bits/linkmap.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/i386/bits/linkmap.h,v
retrieving revision 1.1
diff -u -p -r1.1 linkmap.h
--- sysdeps/i386/bits/linkmap.h 6 Jan 2005 22:40:17 -0000 1.1
+++ sysdeps/i386/bits/linkmap.h 16 Sep 2005 04:24:37 -0000
@@ -2,4 +2,5 @@ struct link_map_machine
   {
     Elf32_Addr plt; /* Address of .plt + 0x16 */
     Elf32_Addr gotplt; /* Address of .got + 0x0c */
+    void *tlsdesc_table; /* Address of TLS descriptor hash table.  */
   };
Index: sysdeps/x86_64/Makefile
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/x86_64/Makefile,v
retrieving revision 1.4
diff -u -p -r1.4 Makefile
--- sysdeps/x86_64/Makefile 16 Aug 2004 06:46:14 -0000 1.4
+++ sysdeps/x86_64/Makefile 16 Sep 2005 04:24:52 -0000
@@ -9,3 +9,13 @@ endif
 ifeq ($(subdir),gmon)
 sysdep_routines += _mcount
 endif
+
+ifeq ($(subdir),elf)
+sysdep-dl-routines += tlsdesc dl-tlsdesc
+sysdep_routines += tlsdesc dl-tlsdesc
+sysdep-rtld-routines += tlsdesc dl-tlsdesc
+endif
+
+ifeq ($(subdir),csu)
+gen-as-const-headers += tlsdesc.sym
+endif
Index: sysdeps/x86_64/dl-lookupcfg.h
===================================================================
RCS file: sysdeps/x86_64/dl-lookupcfg.h
diff -N sysdeps/x86_64/dl-lookupcfg.h
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/x86_64/dl-lookupcfg.h 16 Sep 2005 04:24:53 -0000
@@ -0,0 +1,28 @@
+/* Configuration of lookup functions.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
Index: sysdeps/x86_64/dl-machine.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/x86_64/dl-machine.h,v
retrieving revision 1.33
diff -u -p -r1.33 dl-machine.h
--- sysdeps/x86_64/dl-machine.h 31 Jul 2005 17:49:44 -0000 1.33
+++ sysdeps/x86_64/dl-machine.h 16 Sep 2005 04:24:55 -0000
@@ -26,6 +26,7 @@
 #include <sys/param.h>
 #include <sysdep.h>
 #include <tls.h>
+#include <dl-tlsdesc.h>
 
 /* Return nonzero iff ELF header is compatible with the running host.  */
 static inline int __attribute__ ((unused))
@@ -131,6 +132,10 @@ elf_machine_runtime_setup (struct link_m
 	got[2] = (Elf64_Addr) &_dl_runtime_resolve;
     }
 
+  if (l->l_info[VERSYMIDX (DT_TLSDESC_GOT)] && lazy)
+    *(Elf64_Addr*)D_PTR (l, l_info[VERSYMIDX (DT_TLSDESC_GOT)])
+      = (Elf64_Addr) &_dl_tlsdesc_resolve_rela;
+
   return lazy;
 }
 
@@ -194,7 +199,9 @@ _dl_start_user:\n\
 # define elf_machine_type_class(type)					      \
   ((((type) == R_X86_64_JUMP_SLOT					      \
      || (type) == R_X86_64_DTPMOD64					      \
-     || (type) == R_X86_64_DTPOFF64 || (type) == R_X86_64_TPOFF64)	      \
+     || (type) == R_X86_64_DTPOFF64					      \
+     || (type) == R_X86_64_TPOFF64					      \
+     || (type) == R_X86_64_TLSDESC)					      \
     * ELF_RTYPE_CLASS_PLT)						      \
    | (((type) == R_X86_64_COPY) * ELF_RTYPE_CLASS_COPY))
 #else
@@ -209,6 +216,9 @@ _dl_start_user:\n\
 /* The x86-64 never uses Elf64_Rel relocations.  */
 #define ELF_MACHINE_NO_REL 1
 
+/* Adjust DT_TLSDESC_GOT and DT_TLSDESC_PLT with load address.  */
+#define ELF_MACHINE_DT_TLSDESC 1
+
 /* We define an initialization functions.  This is called very early in
    _dl_sysdep_start.  */
 #define DL_PLATFORM_INIT dl_platform_init ()
@@ -323,6 +333,34 @@ elf_machine_rela (struct link_map *map, 
 	    *reloc_addr = sym->st_value + reloc->r_addend;
 # endif
 	  break;
+	case R_X86_64_TLSDESC:
+# ifndef RTLD_BOOTSTRAP
+	  if (sym != NULL)
+# endif
+	    {
+	      struct tlsdesc volatile *td =
+		(struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+#  ifndef SHARED
+	      CHECK_STATIC_TLS (map, sym_map);
+#  else
+	      if (!TRY_STATIC_TLS (map, sym_map))
+		{
+		  td->arg = _dl_make_tlsdesc_dynamic
+		    (sym_map, sym->st_value + (intptr_t)td->arg);
+		  td->entry = _dl_tlsdesc_dynamic;
+		}
+	      else
+#  endif
+# endif
+		{
+		  td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+				    + (intptr_t)td->arg);
+		  td->entry = _dl_tlsdesc_return;
+		}
+	    }
+	  break;
 	case R_X86_64_TPOFF64:
 	  /* The offset is negative, forward from the thread pointer.  */
 # ifndef RTLD_BOOTSTRAP
@@ -435,6 +473,14 @@ elf_machine_lazy_rel (struct link_map *m
 	  map->l_mach.plt
 	  + (((Elf64_Addr) reloc_addr) - map->l_mach.gotplt) * 2;
     }
+  else if (__builtin_expect (r_type == R_X86_64_TLSDESC, 1))
+    {
+      struct tlsdesc volatile * __attribute__((__unused__)) td =
+	(struct tlsdesc volatile *)reloc_addr;
+
+      td->arg = (void*)reloc;
+      td->entry = (void*)D_PTR (map, l_info[VERSYMIDX (DT_TLSDESC_PLT)]);
+    }
   else
     _dl_reloc_bad_type (map, r_type, 1);
 }
Index: sysdeps/x86_64/dl-tls.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/x86_64/dl-tls.h,v
retrieving revision 1.1
diff -u -p -r1.1 dl-tls.h
--- sysdeps/x86_64/dl-tls.h 29 Sep 2002 10:38:20 -0000 1.1
+++ sysdeps/x86_64/dl-tls.h 16 Sep 2005 04:24:55 -0000
@@ -1,5 +1,5 @@
 /* Thread-local storage handling in the ELF dynamic linker.  x86-64 version.
-   Copyright (C) 2002 Free Software Foundation, Inc.
+   Copyright (C) 2002, 2005 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -19,11 +19,13 @@
 
 
 /* Type used for the representation of TLS information in the GOT.  */
-typedef struct
+typedef struct dl_tls_index
 {
   unsigned long int ti_module;
   unsigned long int ti_offset;
 } tls_index;
 
 
+#ifdef SHARED
 extern void *__tls_get_addr (tls_index *ti);
+#endif
Index: sysdeps/x86_64/dl-tlsdesc.S
===================================================================
RCS file: sysdeps/x86_64/dl-tlsdesc.S
diff -N sysdeps/x86_64/dl-tlsdesc.S
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/x86_64/dl-tlsdesc.S 16 Sep 2005 04:24:56 -0000
@@ -0,0 +1,196 @@
+/* Thread-local storage handling in the ELF dynamic linker.  x86_64 version.
+   Copyright (C) 2004, 2005 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <sysdep.h>
+#include <tls.h>
+#include "tlsdesc.h"
+
+	.text
+#ifdef USE_TLS
+	.hidden _dl_tlsdesc_return
+	.global	_dl_tlsdesc_return
+	.type	_dl_tlsdesc_return,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_return:
+	movq	8(%rax), %rax
+	ret
+	cfi_endproc
+	.size	_dl_tlsdesc_return, .-_dl_tlsdesc_return
+
+#ifdef SHARED
+	.hidden _dl_tlsdesc_dynamic
+	.global	_dl_tlsdesc_dynamic
+	.type	_dl_tlsdesc_dynamic,@function
+	
+     /* %rax points to the TLS descriptor, such that 0(%rax) points to
+	_dl_tlsdesc_dynamic itself, and 8(%rax) points to a struct
+	tlsdesc_dynamic_arg object.  It must return in %rax the offset
+	between the thread pointer and the object denoted by the
+	argument, without clobbering any registers.
+
+	The assembly code that follows is a rendition of the following
+	C code, hand-optimized a little bit.
+	
+ptrdiff_t
+_dl_tlsdesc_dynamic (register struct tlsdesc *tdp asm ("%rax")) 
+{
+  struct tlsdesc_dynamic_arg *td = tdp->arg; 
+  dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + DTV_OFFSET);
+  if (__builtin_expect (td->gen_count <= dtv[0].counter
+			&& (dtv[td->tlsinfo.ti_module].pointer.val
+			    != TLS_DTV_UNALLOCATED),
+			1))
+    return dtv[td->tlsinfo.ti_module].pointer.val + td->tlsinfo.ti_offset
+      - __thread_pointer;
+
+  return __tls_get_addr_internal (&td->tlsinfo) - __thread_pointer;
+}
+*/
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_dynamic:
+	/* Preserve call-clobbered registers that we modify.
+	   We need two scratch regs anyway.  */
+	pushq	%rsi
+	cfi_adjust_cfa_offset (8)
+	movq	%fs:DTV_OFFSET, %rsi
+	pushq	%rdi
+	cfi_adjust_cfa_offset (8)
+	movq	TLSDESC_ARG(%rax), %rdi
+	movq	(%rsi), %rax
+	cmpq	%rax, TLSDESC_GEN_COUNT(%rdi)
+	ja	.Lslow
+	movq	TLSDESC_MODID(%rdi), %rax
+	salq	$4, %rax
+	movq	(%rax,%rsi), %rax
+	cmpq	$-1, %rax
+	je	.Lslow
+	addq	TLSDESC_MODOFF(%rdi), %rax
+.Lret:
+	popq	%rdi
+	cfi_adjust_cfa_offset (-8)
+	subq	%fs:0, %rax
+	popq	%rsi
+	cfi_adjust_cfa_offset (-8)
+	ret
+.Lslow:
+	/* Besides rdi and rsi, saved above, save rdx, rcx, r8, r9,
+	   r10 and r11.  Also, align the stack, that's off by 8 bytes.	*/
+	cfi_adjust_cfa_offset (16)
+	subq	$56, %rsp
+	cfi_adjust_cfa_offset (56)
+	movq	%rdx, 8(%rsp)
+	movq	%rcx, 16(%rsp)
+	movq	%r8, 24(%rsp)
+	movq	%r9, 32(%rsp)
+	movq	%r10, 40(%rsp)
+	movq	%r11, 48(%rsp)
+	/* %rdi already points to the tlsinfo data structure.  */
+	call	__tls_get_addr@PLT
+	movq	8(%rsp), %rdx
+	movq	16(%rsp), %rcx
+	movq	24(%rsp), %r8
+	movq	32(%rsp), %r9
+	movq	40(%rsp), %r10
+	movq	48(%rsp), %r11
+	addq	$56, %rsp
+	cfi_adjust_cfa_offset (-56)
+	jmp	.Lret
+	cfi_endproc
+	.size	_dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
+#endif /* SHARED */
+	
+	.hidden _dl_tlsdesc_resolve_rela
+	.global	_dl_tlsdesc_resolve_rela
+	.type	_dl_tlsdesc_resolve_rela,@function
+	cfi_startproc
+	.align 16
+	/* The PLT entry will have pushed the link_map pointer.  */
+	cfi_adjust_cfa_offset (8)
+_dl_tlsdesc_resolve_rela:
+	/* Save all call-clobbered registers.  */
+	subq	$72, %rsp
+	cfi_adjust_cfa_offset (72)	
+	movq	%rax, (%rsp)
+	movq	%rdi, 8(%rsp)
+	movq	%rax, %rdi	/* Pass tlsdesc* in %rdi.  */
+	movq	%rsi, 16(%rsp)
+	movq	72(%rsp), %rsi	/* Pass link_map* in %rsi.  */
+	movq	%r8, 24(%rsp)
+	movq	%r9, 32(%rsp)
+	movq	%r10, 40(%rsp)
+	movq	%r11, 48(%rsp)
+	movq	%rdx, 56(%rsp)
+	movq	%rcx, 64(%rsp)
+	call	_dl_tlsdesc_resolve_rela_fixup
+	movq	(%rsp), %rax
+	movq	8(%rsp), %rdi
+	movq	16(%rsp), %rsi
+	movq	24(%rsp), %r8
+	movq	32(%rsp), %r9
+	movq	40(%rsp), %r10
+	movq	48(%rsp), %r11
+	movq	56(%rsp), %rdx
+	movq	64(%rsp), %rcx
+	addq	$80, %rsp
+	cfi_adjust_cfa_offset (-80)
+	jmp	*(%rax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
+
+	.hidden _dl_tlsdesc_resolve_hold
+	.global	_dl_tlsdesc_resolve_hold
+	.type	_dl_tlsdesc_resolve_hold,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_hold:
+0:
+	/* Save all call-clobbered registers.  */
+	subq	$72, %rsp
+	cfi_adjust_cfa_offset (72)	
+	movq	%rax, (%rsp)
+	movq	%rdi, 8(%rsp)
+	movq	%rax, %rdi	/* Pass tlsdesc* in %rdi.  */
+	movq	%rsi, 16(%rsp)
+	movq	$1f - 0b, %rsi	/* Pass return address offset in %rsi.  */
+	movq	%r8, 24(%rsp)
+	movq	%r9, 32(%rsp)
+	movq	%r10, 40(%rsp)
+	movq	%r11, 48(%rsp)
+	movq	%rdx, 56(%rsp)
+	movq	%rcx, 64(%rsp)
+	call	_dl_tlsdesc_resolve_hold_fixup
+1:
+	movq	(%rsp), %rax
+	movq	8(%rsp), %rdi
+	movq	16(%rsp), %rsi
+	movq	24(%rsp), %r8
+	movq	32(%rsp), %r9
+	movq	40(%rsp), %r10
+	movq	48(%rsp), %r11
+	movq	56(%rsp), %rdx
+	movq	64(%rsp), %rcx
+	addq	$72, %rsp
+	cfi_adjust_cfa_offset (-72)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
+	
+#endif /* USE_TLS */
Index: sysdeps/x86_64/dl-tlsdesc.h
===================================================================
RCS file: sysdeps/x86_64/dl-tlsdesc.h
diff -N sysdeps/x86_64/dl-tlsdesc.h
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/x86_64/dl-tlsdesc.h 16 Sep 2005 04:24:56 -0000
@@ -0,0 +1,61 @@
+/* Thread-local storage descriptor handling in the ELF dynamic linker.
+   x86_64 version.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _X86_64_DL_TLSDESC_H
+# define _X86_64_DL_TLSDESC_H 1
+
+/* Use this to access DT_TLSDESC_PLT and DT_TLSDESC_GOT.  */
+#ifndef VERSYMIDX
+# define VERSYMIDX(sym)	(DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGIDX (sym))
+#endif
+
+/* Type used to represent a TLS descriptor in the GOT.  */
+struct tlsdesc
+{
+  ptrdiff_t (*entry)(struct tlsdesc *on_rax);
+  void *arg;
+};
+
+typedef struct dl_tls_index
+{
+  unsigned long int ti_module;
+  unsigned long int ti_offset;
+} tls_index;
+
+/* Type used as the argument in a TLS descriptor for a symbol that
+   needs dynamic TLS offsets.  */
+struct tlsdesc_dynamic_arg
+{
+  tls_index tlsinfo;
+  size_t gen_count;
+};
+
+extern ptrdiff_t attribute_hidden
+  _dl_tlsdesc_return(struct tlsdesc *on_rax),
+  _dl_tlsdesc_resolve_rela(struct tlsdesc *on_rax),
+  _dl_tlsdesc_resolve_hold(struct tlsdesc *on_rax);
+
+# ifdef SHARED
+extern void *_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset);
+
+extern ptrdiff_t attribute_hidden _dl_tlsdesc_dynamic(struct tlsdesc *);
+# endif
+
+#endif
Index: sysdeps/x86_64/tlsdesc.c
===================================================================
RCS file: sysdeps/x86_64/tlsdesc.c
diff -N sysdeps/x86_64/tlsdesc.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/x86_64/tlsdesc.c 16 Sep 2005 04:24:56 -0000
@@ -0,0 +1,548 @@
+/* Manage TLS descriptors.  x86_64 version.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <link.h>
+#include <ldsodefs.h>
+#include <elf/dynamic-link.h>
+#include <tls.h>
+#include <dl-tlsdesc.h>
+
+#ifdef USE_TLS
+# ifdef SHARED
+
+extern void weak_function free (void *ptr);
+
+/* The hashcode handling code below is heavily inspired in libiberty's
+   hashtab code, but with most adaptation points and support for
+   deleting elements removed.
+
+   Copyright (C) 1999, 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
+   Contributed by Vladimir Makarov (vmakarov@cygnus.com).  */
+
+inline static unsigned long
+higher_prime_number (unsigned long n)
+{
+  /* These are primes that are near, but slightly smaller than, a
+     power of two.  */
+  static const unsigned long primes[] = {
+    (unsigned long) 7,
+    (unsigned long) 13,
+    (unsigned long) 31,
+    (unsigned long) 61,
+    (unsigned long) 127,
+    (unsigned long) 251,
+    (unsigned long) 509,
+    (unsigned long) 1021,
+    (unsigned long) 2039,
+    (unsigned long) 4093,
+    (unsigned long) 8191,
+    (unsigned long) 16381,
+    (unsigned long) 32749,
+    (unsigned long) 65521,
+    (unsigned long) 131071,
+    (unsigned long) 262139,
+    (unsigned long) 524287,
+    (unsigned long) 1048573,
+    (unsigned long) 2097143,
+    (unsigned long) 4194301,
+    (unsigned long) 8388593,
+    (unsigned long) 16777213,
+    (unsigned long) 33554393,
+    (unsigned long) 67108859,
+    (unsigned long) 134217689,
+    (unsigned long) 268435399,
+    (unsigned long) 536870909,
+    (unsigned long) 1073741789,
+    (unsigned long) 2147483647,
+					/* 4294967291L */
+    ((unsigned long) 2147483647) + ((unsigned long) 2147483644),
+  };
+
+  const unsigned long *low = &primes[0];
+  const unsigned long *high = &primes[sizeof(primes) / sizeof(primes[0])];
+
+  while (low != high)
+    {
+      const unsigned long *mid = low + (high - low) / 2;
+      if (n > *mid)
+	low = mid + 1;
+      else
+	high = mid;
+    }
+
+#if 0
+  /* If we've run out of primes, abort.  */
+  if (n > *low)
+    {
+      fprintf (stderr, "Cannot find prime bigger than %lu\n", n);
+      abort ();
+    }
+#endif
+
+  return *low;
+}
+
+struct hashtab
+{
+  /* Table itself.  */
+  void **entries;
+
+  /* Current size (in entries) of the hash table */
+  size_t size;
+
+  /* Current number of elements.  */
+  size_t n_elements;
+};  
+
+inline static struct hashtab *
+htab_create (void)
+{
+  struct hashtab *ht = malloc (sizeof (struct hashtab));
+
+  if (! ht)
+    return NULL;
+  ht->size = 3;
+  ht->entries = malloc (sizeof (void *) * ht->size);
+  if (! ht->entries)
+    return NULL;
+  
+  ht->n_elements = 0;
+
+  memset (ht->entries, 0, sizeof (void *) * ht->size);
+  
+  return ht;
+}
+
+/* This is only called from _dl_unmap, so it's safe to call
+   free().  See the discussion below.  */
+inline static void
+htab_delete (struct hashtab *htab)
+{
+  int i;
+
+  for (i = htab->size - 1; i >= 0; i--)
+    if (htab->entries[i])
+      free (htab->entries[i]);
+
+  free (htab->entries);
+  free (htab);
+}
+
+/* Similar to htab_find_slot, but without several unwanted side effects:
+    - Does not call htab->eq_f when it finds an existing entry.
+    - Does not change the count of elements/searches/collisions in the
+      hash table.
+   This function also assumes there are no deleted entries in the table.
+   HASH is the hash value for the element to be inserted.  */
+
+inline static void **
+find_empty_slot_for_expand (struct hashtab *htab, int hash)
+{
+  size_t size = htab->size;
+  unsigned int index = hash % size;
+  void **slot = htab->entries + index;
+  int hash2;
+
+  if (! *slot)
+    return slot;
+
+  hash2 = 1 + hash % (size - 2);
+  for (;;)
+    {
+      index += hash2;
+      if (index >= size)
+	index -= size;
+
+      slot = htab->entries + index;
+      if (! *slot)
+	return slot;
+    }
+}
+
+/* The following function changes size of memory allocated for the
+   entries and repeatedly inserts the table elements.  The occupancy
+   of the table after the call will be about 50%.  Naturally the hash
+   table must already exist.  Remember also that the place of the
+   table entries is changed.  If memory allocation failures are allowed,
+   this function will return zero, indicating that the table could not be
+   expanded.  If all goes well, it will return a non-zero value.  */
+
+inline static int
+htab_expand (struct hashtab *htab, int (*hash_fn)(void *))
+{
+  void **oentries;
+  void **olimit;
+  void **p;
+  void **nentries;
+  size_t nsize;
+
+  oentries = htab->entries;
+  olimit = oentries + htab->size;
+
+  /* Resize only when table after removal of unused elements is either
+     too full or too empty.  */
+  if (htab->n_elements * 2 > htab->size)
+    nsize = higher_prime_number (htab->n_elements * 2);
+  else
+    nsize = htab->size;
+
+  nentries = malloc (sizeof (void *) * nsize);
+  memset (nentries, 0, sizeof (void *) * nsize);
+  if (nentries == NULL)
+    return 0;
+  htab->entries = nentries;
+  htab->size = nsize;
+
+  p = oentries;
+  do
+    {
+      if (*p)
+	*find_empty_slot_for_expand (htab, hash_fn (*p))
+	  = *p;
+
+      p++;
+    }
+  while (p < olimit);
+
+#if 0 /* We can't tell whether this was allocated by the malloc()
+	 built into ld.so or the one in the main executable or libc,
+	 and calling free() for something that wasn't malloc()ed could
+	 do Very Bad Things (TM).  Take the conservative approach
+	 here, potentially wasting as much memory as actually used by
+	 the hash table, even if multiple growths occur.  That's not
+	 so bad as to require some overengineered solution that would
+	 enable us to keep track of how it was allocated. */
+  free (oentries);
+#endif
+  return 1;
+}
+
+/* This function searches for a hash table slot containing an entry
+   equal to the given element.  To delete an entry, call this with
+   INSERT = 0, then call htab_clear_slot on the slot returned (possibly
+   after doing some checks).  To insert an entry, call this with
+   INSERT = 1, then write the value you want into the returned slot.
+   When inserting an entry, NULL may be returned if memory allocation
+   fails.  */
+
+inline static void **
+htab_find_slot (struct hashtab *htab, void *ptr, int insert,
+		int (*hash_fn)(void *), int (*eq_fn)(void *, void *))
+{
+  unsigned int index;
+  int hash, hash2;
+  size_t size;
+  void **entry;
+
+  if (htab->size * 3 <= htab->n_elements * 4
+      && htab_expand (htab, hash_fn) == 0)
+    return NULL;
+
+  hash = hash_fn (ptr);
+
+  size = htab->size;
+  index = hash % size;
+
+  entry = &htab->entries[index];
+  if (!*entry)
+    goto empty_entry;
+  else if (eq_fn (*entry, ptr))
+    return entry;
+      
+  hash2 = 1 + hash % (size - 2);
+  for (;;)
+    {
+      index += hash2;
+      if (index >= size)
+	index -= size;
+      
+      entry = &htab->entries[index];
+      if (!*entry)
+	goto empty_entry;
+      else if (eq_fn (*entry, ptr))
+	return entry;
+    }
+
+ empty_entry:
+  if (!insert)
+    return NULL;
+
+  htab->n_elements++;
+  return entry;
+}
+
+inline static int
+hash_tlsdesc(void *p)
+{
+  struct tlsdesc_dynamic_arg *td = p;
+
+  /* We know all entries are for the same module, so ti_offset is the
+     only distinguishing entry.  */
+  return td->tlsinfo.ti_offset;
+}
+
+inline static int
+eq_tlsdesc(void *p, void *q)
+{
+  struct tlsdesc_dynamic_arg *tdp = p, *tdq = q;
+
+  return tdp->tlsinfo.ti_offset == tdq->tlsinfo.ti_offset;
+}
+
+inline static int
+map_generation (struct link_map *map)
+{
+  size_t idx = map->l_tls_modid;
+  struct dtv_slotinfo_list *listp = GL(dl_tls_dtv_slotinfo_list);
+
+  /* Find the place in the dtv slotinfo list.  */
+  do
+    {
+      /* Does it fit in the array of this list element?  */
+      if (idx < listp->len)
+	{
+	  /* We should never get here for a module in static TLS, so
+	     we can assume that, if the generation count is zero, we
+	     still haven't determined the generation count for this
+	     module.  */
+	  if (listp->slotinfo[idx].gen)
+	    return listp->slotinfo[idx].gen;
+	  else
+	    break;
+	}
+      idx -= listp->len;
+      listp = listp->next;
+    }
+  while (listp != NULL);
+  
+  /* If we get to this point, the module still hasn't been assigned an
+     entry in the dtv slotinfo data structures, and it will when we're
+     done with relocations.  At that point, the module will get a
+     generation number that is one past the current generation, so
+     return exactly that.  */
+  return GL(dl_tls_generation) + 1;
+}
+
+void *
+_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
+{
+  struct hashtab *ht;
+  void **entry;
+  struct tlsdesc_dynamic_arg *td, test;
+
+  /* FIXME: We could use a per-map lock here, but is it worth it?  */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+
+  ht = map->l_mach.tlsdesc_table;
+  if (! ht)
+    {
+      ht = htab_create ();
+      if (! ht)
+	{
+	  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+	  return 0;
+	}
+      map->l_mach.tlsdesc_table = ht;
+    }
+
+  test.tlsinfo.ti_module = map->l_tls_modid;
+  test.tlsinfo.ti_offset = ti_offset;
+  entry = htab_find_slot (ht, &test, 1, hash_tlsdesc, eq_tlsdesc);
+  if (*entry)
+    {
+      td = *entry;
+      __rtld_lock_unlock_recursive (GL(dl_load_lock));
+      return td;
+    }
+
+  *entry = td = malloc (sizeof (struct tlsdesc_dynamic_arg));
+  /* This may be higher than the map's generation, but it doesn't
+     matter much.  Worst case, we'll have one extra DTV update per
+     thread.  */
+  td->gen_count = map_generation (map);
+  td->tlsinfo = test.tlsinfo;
+
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+  return td;
+}
+
+# endif /* SHARED */
+
+/* The idea of the following two functions is to stop multiple threads
+   from attempting to resolve the same TLS descriptor without busy
+   waiting.  Ideally, we should be able to release the lock right
+   after changing td->entry, and then using say a condition variable
+   or a futex wake to wake up any waiting threads, but let's try to
+   avoid introducing such dependencies.  */
+
+inline static int
+_dl_tlsdesc_resolve_early_return_p (struct tlsdesc volatile *td, void *caller)
+{
+  if (caller != td->entry)
+    return 1;
+
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+  if (caller != td->entry)
+    {
+      __rtld_lock_unlock_recursive (GL(dl_load_lock));
+      return 1;
+    }
+
+  td->entry = _dl_tlsdesc_resolve_hold;
+  
+  return 0;
+}
+
+inline static void
+_dl_tlsdesc_wake_up_held_fixups (void)
+{
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+/* The following 2 functions take an entry_check_offset argument.
+   It's computed by the caller as an offset between its entry point
+   and the call site, such that by adding the built-in return address
+   that is implicitly passed to the function with this offset, we can
+   easily obtain the caller's entry point to compare with the entry
+   point given in the TLS descriptor.  If it's changed, we want to
+   return immediately.  */
+
+/* These macros are copied from elf/dl-reloc.c */
+
+#define CHECK_STATIC_TLS(map, sym_map)					\
+    do {								\
+      if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET	\
+			    || ((sym_map)->l_tls_offset			\
+				== FORCED_DYNAMIC_TLS_OFFSET), 0))	\
+	_dl_allocate_static_tls (sym_map);				\
+    } while (0)
+
+#define TRY_STATIC_TLS(map, sym_map)					\
+    (__builtin_expect ((sym_map)->l_tls_offset				\
+		       != FORCED_DYNAMIC_TLS_OFFSET, 1)			\
+     && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1)	\
+	 || _dl_try_allocate_static_tls (sym_map) == 0))
+
+int internal_function _dl_try_allocate_static_tls (struct link_map *map);
+
+/* This function is used to lazily resolve TLS_DESC RELA relocations.
+   The argument location is used to hold a pointer to the relocation.  */
+
+void
+attribute_hidden
+_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
+				struct link_map *l)
+{
+  const ElfW(Rela) *reloc = td->arg;
+
+  if (_dl_tlsdesc_resolve_early_return_p
+      (td, (void*)D_PTR (l, l_info[VERSYMIDX (DT_TLSDESC_PLT)])))
+    return;
+
+  /* The code below was borrowed from _dl_fixup().  */
+  const ElfW(Sym) *const symtab
+    = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+  const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+  const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+  lookup_t result;
+
+   /* Look up the target symbol.  If the normal lookup rules are not
+      used don't look in the global scope.  */
+  if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+      && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+    {
+      const struct r_found_version *version = NULL;
+
+      if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+	{
+	  const ElfW(Half) *vernum =
+	    (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+	  ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+	  version = &l->l_versions[ndx];
+	  if (version->hash == 0)
+	    version = NULL;
+	}
+
+      result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+				    l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+				    DL_LOOKUP_ADD_DEPENDENCY, NULL);
+    }
+  else
+    {
+      /* We already found the symbol.  The module (and therefore its load
+	 address) is also known.  */
+      result = l;
+    }
+
+#  ifndef SHARED
+  CHECK_STATIC_TLS (l, result);
+#  else
+  if (!TRY_STATIC_TLS (l, result))
+    {
+      td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
+					  + reloc->r_addend);
+      td->entry = _dl_tlsdesc_dynamic;
+    }
+  else
+#  endif
+    {
+      td->arg = (void*)(sym->st_value - result->l_tls_offset
+			+ reloc->r_addend);
+      td->entry = _dl_tlsdesc_return;
+    }
+
+  _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+void
+attribute_hidden
+_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
+				ptrdiff_t entry_check_offset)
+{
+  /* Maybe we're lucky and can return early.  */
+  if (__builtin_return_address (0) - entry_check_offset != td->entry)
+    return;
+
+  /* Locking here will stop execution until the runnign resolver runs
+     _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
+
+     FIXME: We'd be better off waiting on a condition variable, such
+     that we didn't have to hold the lock throughout the relocation
+     processing.  */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+#endif /* USE_TLS */
+
+void
+_dl_unmap (struct link_map *map)
+{
+  __munmap ((void *) (map)->l_map_start,
+	    (map)->l_map_end - (map)->l_map_start);
+
+#if USE_TLS && SHARED
+  /* _dl_unmap is only called for dlopen()ed libraries, for which
+     calling free() is safe, or before we've completed the initial
+     relocation, in which case calling free() is probably pointless,
+     but still safe.  */
+  if (map->l_mach.tlsdesc_table)
+    htab_delete (map->l_mach.tlsdesc_table);
+#endif
+}
Index: sysdeps/x86_64/tlsdesc.sym
===================================================================
RCS file: sysdeps/x86_64/tlsdesc.sym
diff -N sysdeps/x86_64/tlsdesc.sym
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ sysdeps/x86_64/tlsdesc.sym 16 Sep 2005 04:24:56 -0000
@@ -0,0 +1,20 @@
+#include <stddef.h>
+#include <sysdep.h>
+#include <tls.h>
+#include <link.h>
+#include <dl-tlsdesc.h>
+
+--
+
+-- Abuse tls.h macros to derive offsets relative to the thread register.
+#if defined USE_TLS
+
+DTV_OFFSET			offsetof(struct pthread, header.dtv)
+
+TLSDESC_ARG			offsetof(struct tlsdesc, arg)
+
+TLSDESC_GEN_COUNT		offsetof(struct tlsdesc_dynamic_arg, gen_count)
+TLSDESC_MODID			offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_module)
+TLSDESC_MODOFF			offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_offset)
+
+#endif
Index: sysdeps/x86_64/bits/linkmap.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/x86_64/bits/linkmap.h,v
retrieving revision 1.1
diff -u -p -r1.1 linkmap.h
--- sysdeps/x86_64/bits/linkmap.h 6 Jan 2005 22:40:14 -0000 1.1
+++ sysdeps/x86_64/bits/linkmap.h 16 Sep 2005 04:24:56 -0000
@@ -3,6 +3,7 @@ struct link_map_machine
   {
     Elf64_Addr plt; /* Address of .plt + 0x16 */
     Elf64_Addr gotplt; /* Address of .got + 0x18 */
+    void *tlsdesc_table; /* Address of TLS descriptor hash table.  */
   };
 
 #else
@@ -10,5 +11,6 @@ struct link_map_machine
   {
     Elf32_Addr plt; /* Address of .plt + 0x16 */
     Elf32_Addr gotplt; /* Address of .got + 0x0c */
+    void *tlsdesc_table; /* Address of TLS descriptor hash table.  */
   };
 #endif
-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]