This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: dynsym/dynstr/relocs sorting speedup #2


Hi Nick,

	Have re-worked the patch based on your comments:

On Mon, 2006-01-16 at 17:07 +0000, Nick Clifton wrote:
> >         It seems that by sorting dynsym & dynstr and relocations by
> > (elf_hash % bucketcount) we can easily get a 5%+ speedup for linking.
> 
> The patch itself looks fairly reasonable.  There are some formatting 
> issues, but they are not important right now.  One glaring thing though 
> is the use of the DONT_SORT_SYMS environment variable.  This is a no-no. 
>   A command line option should be used instead.  (I assume that you used 
> the environment variable as a quick way to get performance figures 
> without having to add the command line code).

	Quite - I've used '-z dynsort' & added a terse description to the docs;
not clear if that's the best name / place for the option (they seem to
be i386 only), but - perhaps it's ok.

> Hmm - this sounds like a good idea, but I am not sure if we could always 
> rely upon this value having been calculated for us when we need it.  I 
> would suggest reserving this to a future optimization of this code.

	Fine - I have a plot brewing to remove 'undef' symbols from the hash
(but not chain), and sort their syms to the end of .dynstr - it's seems
crazy to have ~1/5th of symbols which no-one ever wants to lookup in the
hash chains; anyhow - that's for another day & will shake that out.

> >                 * qsort - has no closure, nasty global variable for
> >                           bucket count - how should that be fixed ?
> 
> By writing your own version of qsort ?  Or by creating a new structure 
> that holds an "elf_link_hash_entry **" hash entry and a "size_t" bucket 
> size and passing pointers to this structure as arguments to qsort.

	The attached patch cut/paste/adds a closure to the glibc qsort function
- 'glib' does a similar thing (gqsort.c) ; it's a small piece of code,
and clearly not having a module global for that is more pleasant; is
that good enough ?

> No idea squire.  A little bit of experimentation would be a god idea here.

	Sure - will do; where do we go from here ?

	Thanks,

		Michael.

-- 
 michael.meeks@novell.com  <><, Pseudo Engineer, itinerant idiot
--- binutils-2.16.91.0.4/bfd/ChangeLog	2005-11-13 17:16:34.000000000 +0000
+++ binutils.current/bfd/ChangeLog	2006-01-18 17:33:24.000000000 +0000
@@ -1,3 +1,40 @@
+2006-01-18  Michael Meeks  <michael.meeks@novell.com>
+
+	* bfdsort.c: import qsort.c from glibc-2.3 and re-work to
+	provide missing closure argument, for more elegant sorting.
+
+	* elf-strtab.c (_bfd_elf_strtab_init, _bfd_elf_strtab_free):
+	manage the sorted array.
+	(_bfd_elf_strtab_emit): emit .dynstr in sorted order.
+	(_bfd_elf_strtab_finalize, hash_compare): sort .dynstr
+	primarily by elf bucket index.
+
+	* elflink.c (elf_finalize_dynstr): pass bucket count to
+	strtab_finalize.
+	(_bfd_elf_ver_hash): split for re-use from
+	(elf_collect_hash_codes): here.
+	(_bfd_elf_link_hash_traverse): new traversal function:
+	walks sorted array if sorted, else walks the hash as before.
+	(elf_sort_dynsym_hash): qsort method to sort dynsyms
+	(elf_sort_collect_dynsyms): hash walker to collect symbols
+	(_bfd_elf_sort_dynsyms): method - sorts .dynsym section.
+	(bfd_elf_size_dynsym_hash_dynstr): do the .dynsym sort.
+	(elf_link_sort_relocs, elf_link_sort_cmp2): sort relocs
+	primarily by elf bucket index.
+	(bfd_elf_final_link): update reloc sort call.
+
+	* elf.c (_bfd_elf_link_hash_table_init),
+	(_bfd_elf_link_hash_table_free): init & free sorted array.
+	(assign_section_numbers): upd. strtab_finalize call.
+
+	* elf32-m68hc1x.c (m68hc11_elf_bfd_link_hash_table_free),
+	* elf-m10300.c (elf32_mn10300_link_hash_table_free),
+	* elf32-hppa.c (elf32_hppa_link_hash_table_free),
+	* elf64-ppc.c (ppc64_elf_link_hash_table_free),
+	* elf.c (_bfd_elf_link_hash_table_free),
+	* elfxx-target.h: implement elf specific hash free
+	method & re-direct callers to it.
+
 2005-11-11  Nick Clifton  <nickc@redhat.com>
 
 	PR 1150

--- binutils-2.16.91.0.4/bfd/bfd.h	2005-12-23 17:03:29.000000000 +0000
+++ binutils.current/bfd/bfd.h	2006-01-18 15:48:46.000000000 +0000
@@ -639,6 +639,9 @@
   DYN_NO_NEEDED = 8
 };
 
+typedef int(*bfd_qsort_closure_func)(const void *, const void *, const void *);
+extern void bfd_qsort (void *base, bfd_size_type nmemb, bfd_size_type size,
+		       bfd_qsort_closure_func cmp, void *closure);
 extern bfd_boolean bfd_elf_record_link_assignment
   (struct bfd_link_info *, const char *, bfd_boolean);
 extern struct bfd_link_needed_list *bfd_elf_get_needed_list
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/ChangeLog binutils.current/bfd/ChangeLog
Files binutils-2.16.91.0.4/bfd/doc/chew and binutils.current/bfd/doc/chew differ
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elf32-hppa.c binutils.current/bfd/elf32-hppa.c
--- binutils-2.16.91.0.4/bfd/elf32-hppa.c	2005-11-13 17:16:34.000000000 +0000
+++ binutils.current/bfd/elf32-hppa.c	2005-12-23 14:06:31.000000000 +0000
@@ -435,7 +435,7 @@
     = (struct elf32_hppa_link_hash_table *) btab;
 
   bfd_hash_table_free (&htab->bstab);
-  _bfd_generic_link_hash_table_free (btab);
+  _bfd_elf_link_hash_table_free (hash);
 }
 
 /* Build a name for an entry in the stub hash table.  */
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elf32-m68hc1x.c binutils.current/bfd/elf32-m68hc1x.c
--- binutils-2.16.91.0.4/bfd/elf32-m68hc1x.c	2005-06-22 21:53:34.000000000 +0100
+++ binutils.current/bfd/elf32-m68hc1x.c	2005-12-23 14:07:21.000000000 +0000
@@ -106,7 +106,7 @@
 
   bfd_hash_table_free (ret->stub_hash_table);
   free (ret->stub_hash_table);
-  _bfd_generic_link_hash_table_free (hash);
+  _bfd_elf_link_hash_table_free (hash);
 }
 
 /* Assorted hash table functions.  */
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elf32-target.h binutils.current/bfd/elf32-target.h
--- binutils-2.16.91.0.4/bfd/elf32-target.h	2005-12-23 17:12:23.000000000 +0000
+++ binutils.current/bfd/elf32-target.h	2005-12-23 14:51:28.000000000 +0000
@@ -205,7 +205,7 @@
 #endif
 
 #ifndef bfd_elf32_bfd_link_hash_table_free
-#define bfd_elf32_bfd_link_hash_table_free _bfd_generic_link_hash_table_free
+#define bfd_elf32_bfd_link_hash_table_free _bfd_elf_link_hash_table_free
 #endif
 
 #ifdef elf_backend_relocate_section
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elf64-ppc.c binutils.current/bfd/elf64-ppc.c
--- binutils-2.16.91.0.4/bfd/elf64-ppc.c	2005-11-13 17:16:34.000000000 +0000
+++ binutils.current/bfd/elf64-ppc.c	2005-12-23 14:04:07.000000000 +0000
@@ -3502,7 +3502,7 @@
 
   bfd_hash_table_free (&ret->stub_hash_table);
   bfd_hash_table_free (&ret->branch_hash_table);
-  _bfd_generic_link_hash_table_free (hash);
+  _bfd_elf_link_hash_table_free (hash);
 }
 
 /* Satisfy the ELF linker by filling in some fields in our fake bfd.  */
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elf-bfd.h binutils.current/bfd/elf-bfd.h
--- binutils-2.16.91.0.4/bfd/elf-bfd.h	2005-12-22 10:42:52.000000000 +0000
+++ binutils.current/bfd/elf-bfd.h	2005-12-23 14:59:54.000000000 +0000
@@ -340,6 +340,10 @@
 {
   struct bfd_link_hash_table root;
 
+  /* Symbol sort order for final traversal at output */
+  unsigned int sorted_size;
+  struct elf_link_hash_entry **sorted;
+
   /* Whether we have created the special dynamic sections required
      when linking against or generating a shared object.  */
   bfd_boolean dynamic_sections_created;
@@ -418,11 +422,16 @@
 /* Traverse an ELF linker hash table.  */
 
 #define elf_link_hash_traverse(table, func, info)			\
-  (bfd_link_hash_traverse						\
-   (&(table)->root,							\
-    (bfd_boolean (*) (struct bfd_link_hash_entry *, void *)) (func),	\
+  (_bfd_elf_link_hash_traverse						\
+   ((table),								\
+    (bfd_boolean (*) (struct elf_link_hash_entry *, void *)) (func),	\
     (info)))
 
+void _bfd_elf_link_hash_traverse
+  (struct elf_link_hash_table *table,
+   bfd_boolean (*func) (struct elf_link_hash_entry *, void *),
+   void *info);
+
 /* Get the ELF linker hash table from a link_info structure.  */
 
 #define elf_hash_table(p) ((struct elf_link_hash_table *) ((p)->hash))
@@ -1446,6 +1455,8 @@
 
 extern unsigned long bfd_elf_hash
   (const char *);
+extern unsigned long _bfd_elf_ver_hash
+  (const char *);
 
 extern bfd_reloc_status_type bfd_elf_generic_reloc
   (bfd *, arelent *, asymbol *, void *, asection *, bfd *, char **);
@@ -1463,6 +1474,7 @@
   (struct bfd_hash_entry *, struct bfd_hash_table *, const char *);
 extern struct bfd_link_hash_table *_bfd_elf_link_hash_table_create
   (bfd *);
+extern void _bfd_elf_link_hash_table_free (struct bfd_link_hash_table *);
 extern void _bfd_elf_link_hash_copy_indirect
   (struct bfd_link_info *, struct elf_link_hash_entry *,
    struct elf_link_hash_entry *);
@@ -1593,7 +1605,7 @@
 extern bfd_boolean _bfd_elf_strtab_emit
   (bfd *, struct elf_strtab_hash *);
 extern void _bfd_elf_strtab_finalize
-  (struct elf_strtab_hash *);
+  (struct elf_strtab_hash *, size_t);
 
 extern bfd_boolean _bfd_elf_discard_section_eh_frame
   (bfd *, struct bfd_link_info *, asection *,
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elf.c binutils.current/bfd/elf.c
--- binutils-2.16.91.0.4/bfd/elf.c	2005-12-22 10:42:52.000000000 +0000
+++ binutils.current/bfd/elf.c	2005-12-23 14:04:18.000000000 +0000
@@ -1560,6 +1560,8 @@
   table->tls_size = 0;
   table->loaded = NULL;
   table->is_relocatable_executable = FALSE;
+  table->sorted = NULL;
+  table->sorted_size = 0;
 
   ret = _bfd_link_hash_table_init (&table->root, abfd, newfunc);
   table->root.type = bfd_link_elf_hash_table;
@@ -1588,6 +1590,15 @@
   return &ret->root;
 }
 
+void
+_bfd_elf_link_hash_table_free (struct bfd_link_hash_table *hash)
+{
+  struct elf_link_hash_table *table = (struct elf_link_hash_table *) hash;
+  if (table->sorted)
+    free (table->sorted);
+  _bfd_generic_link_hash_table_free (hash);
+}
+
 /* This is a hook for the ELF emulation code in the generic linker to
    tell the backend linker what file name to use for the DT_NEEDED
    entry for a dynamic object.  */
@@ -2983,7 +2994,7 @@
       _bfd_elf_strtab_addref (elf_shstrtab (abfd), t->strtab_hdr.sh_name);
     }
 
-  _bfd_elf_strtab_finalize (elf_shstrtab (abfd));
+  _bfd_elf_strtab_finalize (elf_shstrtab (abfd), 0);
   t->shstrtab_hdr.sh_size = _bfd_elf_strtab_size (elf_shstrtab (abfd));
 
   elf_numsections (abfd) = section_number;
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elflink.c binutils.current/bfd/elflink.c
--- binutils-2.16.91.0.4/bfd/elflink.c	2005-12-22 10:42:52.000000000 +0000
+++ binutils.current/bfd/elflink.c	2006-01-18 16:36:45.000000000 +0000
@@ -3007,7 +3007,8 @@
   const struct elf_backend_data *bed;
   bfd_byte *extdyn;
 
-  _bfd_elf_strtab_finalize (dynstr);
+  _bfd_elf_strtab_finalize (dynstr, info->dynsort ?
+			    elf_hash_table (info)->bucketcount : 0);
   size = _bfd_elf_strtab_size (dynstr);
 
   bed = get_elf_backend_data (dynobj);
@@ -4715,27 +4716,17 @@
       return FALSE;
     }
 }
-
-/* This function will be called though elf_link_hash_traverse to store
-   all hash value of the exported symbols in an array.  */
 
-static bfd_boolean
-elf_collect_hash_codes (struct elf_link_hash_entry *h, void *data)
+/*
+ * Compute the elf hash value of the name ignoring the version.
+ */
+unsigned long
+_bfd_elf_ver_hash (const char *name)
 {
-  unsigned long **valuep = data;
-  const char *name;
   char *p;
   unsigned long ha;
   char *alc = NULL;
 
-  if (h->root.type == bfd_link_hash_warning)
-    h = (struct elf_link_hash_entry *) h->root.u.i.link;
-
-  /* Ignore indirect symbols.  These are added by the versioning code.  */
-  if (h->dynindx == -1)
-    return TRUE;
-
-  name = h->root.root.string;
   p = strchr (name, ELF_VER_CHR);
   if (p != NULL)
     {
@@ -4748,6 +4739,31 @@
   /* Compute the hash value.  */
   ha = bfd_elf_hash (name);
 
+  if (alc != NULL)
+    free (alc);
+
+  return ha;
+}
+
+
+/* This function will be called though elf_link_hash_traverse to store
+   all hash value of the exported symbols in an array.  */
+
+static bfd_boolean
+elf_collect_hash_codes (struct elf_link_hash_entry *h, void *data)
+{
+  unsigned long **valuep = data;
+  unsigned long ha;
+
+  if (h->root.type == bfd_link_hash_warning)
+    h = (struct elf_link_hash_entry *) h->root.u.i.link;
+
+  /* Ignore indirect symbols.  These are added by the versioning code.  */
+  if (h->dynindx == -1)
+    return TRUE;
+
+  ha = _bfd_elf_ver_hash (h->root.root.string);
+
   /* Store the found hash value in the array given as the argument.  */
   *(*valuep)++ = ha;
 
@@ -4755,9 +4771,6 @@
      later.  */
   h->u.elf_hash_value = ha;
 
-  if (alc != NULL)
-    free (alc);
-
   return TRUE;
 }
 
@@ -4920,6 +4933,123 @@
   return best_size;
 }
 
+void _bfd_elf_link_hash_traverse
+  (struct elf_link_hash_table *table,
+   bfd_boolean (*func) (struct elf_link_hash_entry *, void *),
+   void *info)
+{
+  if (!table->sorted)
+    bfd_link_hash_traverse						\
+      (&(table)->root,							\
+       (bfd_boolean (*) (struct bfd_link_hash_entry *, void *)) (func),	\
+       (info));
+  else
+    {
+      unsigned int i;
+      for (i = 0; i < table->sorted_size; i++)
+        {
+          if (! func (table->sorted[i], info))
+	    return;
+	}
+    }
+}
+
+/* Sort by elf hash value % buckets  */
+static int
+elf_sort_dynsym_hash (const void *arg1, const void *arg2, const void *closure)
+{
+  size_t h1_bucket, h2_bucket;
+  const struct elf_link_hash_entry *h1;
+  const struct elf_link_hash_entry *h2;
+  const bfd_size_type *bucketcount;
+
+  h1 = *(const struct elf_link_hash_entry **) arg1;
+  h2 = *(const struct elf_link_hash_entry **) arg2;
+  bucketcount = closure;
+
+  h1_bucket = h1->u.elf_hash_value % *bucketcount;
+  h2_bucket = h2->u.elf_hash_value % *bucketcount;
+
+  if (h1_bucket > h2_bucket)
+    return 1;
+  if (h1_bucket < h2_bucket)
+    return -1;
+
+  return 0;
+}
+
+struct elf_dynsym_sort_info
+{
+  bfd_boolean  do_dynsym;
+  unsigned int alloc_size;
+  unsigned int sorted_size;
+  struct elf_link_hash_entry **sorted_syms;
+};
+
+/* collect sym entries into an array for later sorting */
+static bfd_boolean
+elf_sort_collect_dynsyms (struct elf_link_hash_entry *h, void *data)
+{
+  struct elf_dynsym_sort_info *sinfo = data;
+
+  if ((sinfo->do_dynsym && h->dynindx < 0) ||
+      (!sinfo->do_dynsym && h->dynindx >= 0))
+    return TRUE;
+
+  if (sinfo->sorted_size >= sinfo->alloc_size)
+    {
+      sinfo->alloc_size *= 2;
+      /* FIXME: need to free this data too ... */
+      sinfo->sorted_syms = bfd_realloc (sinfo->sorted_syms,
+					 sizeof (struct elf_link_hash_entry *) *
+					 sinfo->alloc_size);
+    }
+  sinfo->sorted_syms [sinfo->sorted_size++] = h;
+
+  return TRUE;
+}
+
+/*
+ * Sort the exported elf symbols by elf_hash % bucketcount to
+ * improve run-time linker cache behavior. Subsequent
+ * elf_link_hash_traverse calls will reflect this new order.
+ */
+static bfd_boolean
+_bfd_elf_sort_dynsyms (struct bfd_link_info *info)
+{
+  bfd_size_type bucketcount;
+  struct elf_dynsym_sort_info sinfo;
+
+  sinfo.alloc_size = 8;
+  sinfo.sorted_syms = bfd_malloc (sizeof (struct elf_link_hash_entry *) *
+				  sinfo.alloc_size);
+  if (!sinfo.sorted_syms)
+    return FALSE;
+
+  sinfo.sorted_size = 0;
+
+  /* append dynsyms for sorting */
+  sinfo.do_dynsym = TRUE;
+  elf_link_hash_traverse (elf_hash_table (info), elf_sort_collect_dynsyms, &sinfo);
+
+  /* sort them ... */
+  bucketcount = elf_hash_table (info)->bucketcount;
+  bfd_qsort (sinfo.sorted_syms, sinfo.sorted_size,
+	     sizeof (struct elf_link_hash_entry *),
+	     elf_sort_dynsym_hash,
+	     &bucketcount);
+
+  /* append everything else */
+  sinfo.do_dynsym = FALSE;
+  elf_link_hash_traverse (elf_hash_table (info), elf_sort_collect_dynsyms, &sinfo);
+
+  /* freed in _bfd_elf_link_hash_table_free */
+  elf_hash_table (info)->sorted = sinfo.sorted_syms;
+  elf_hash_table (info)->sorted_size = sinfo.sorted_size;
+
+  return TRUE;
+}
+
 /* Set up the sizes and contents of the ELF dynamic sections.  This is
    called by the ELF linker emulation before_allocation routine.  We
    must set the sizes of the sections before the linker sets the
@@ -5686,6 +5816,7 @@
 	 section symbol for each output section, which come first.
 	 Next come all of the back-end allocated local dynamic syms,
 	 followed by the rest of the global symbols.  */
+      /* To sort these optimally we need the correct bucketcount */
 
       dynsymcount = _bfd_elf_link_renumber_dynsyms (output_bfd, info,
 						    &section_sym_count);
@@ -5756,6 +5887,17 @@
       for (dtagcount = 0; dtagcount <= info->spare_dynamic_tags; ++dtagcount)
 	if (!_bfd_elf_add_dynamic_entry (info, DT_NULL, 0))
 	  return FALSE;
+      
+      /* Sort .dynsym to accelerate runtime linking */
+      if (info->dynsort)
+      {
+        if (!_bfd_elf_sort_dynsyms (info))
+          return FALSE;
+
+	/* renumber to reflect the new sorting order */
+	_bfd_elf_link_renumber_dynsyms (output_bfd, info,
+					&section_sym_count);
+      }
     }
 
   return TRUE;
@@ -5892,6 +6034,7 @@
     bfd_vma sym_mask;
   } u;
   enum elf_reloc_type_class type;
+  unsigned long elf_bucket;
   /* We use this as an array of size int_rels_per_ext_rel.  */
   Elf_Internal_Rela rela[1];
 };
@@ -5928,6 +6071,10 @@
   const struct elf_link_sort_rela *b = B;
   int copya, copyb;
 
+  if (a->elf_bucket < b->elf_bucket)
+    return -1;
+  if (a->elf_bucket > b->elf_bucket)
+    return 1;
   if (a->u.offset < b->u.offset)
     return -1;
   if (a->u.offset > b->u.offset)
@@ -5946,8 +6093,9 @@
 }
 
 static size_t
-elf_link_sort_relocs (bfd *abfd, struct bfd_link_info *info, asection **psec)
+elf_link_sort_relocs (bfd *abfd, struct elf_final_link_info *finfo, asection **psec)
 {
+  struct bfd_link_info *info = finfo->info;
   asection *reldyn;
   bfd_size_type count, size;
   size_t i, ret, sort_elt, ext_size;
@@ -5959,6 +6107,7 @@
   void (*swap_out) (bfd *, const Elf_Internal_Rela *, bfd_byte *);
   struct bfd_link_order *lo;
   bfd_vma r_sym_mask;
+  int r_sym_shift;
 
   reldyn = bfd_get_section_by_name (abfd, ".rela.dyn");
   if (reldyn == NULL || reldyn->size == 0)
@@ -6000,15 +6149,29 @@
     }
 
   if (bed->s->arch_size == 32)
-    r_sym_mask = ~(bfd_vma) 0xff;
+    {
+      r_sym_mask = ~(bfd_vma) 0xff;
+      r_sym_shift = 8;
+    }
   else
-    r_sym_mask = ~(bfd_vma) 0xffffffff;
+    {
+      r_sym_mask = ~(bfd_vma) 0xffffffff;
+      r_sym_shift = 32;
+    }
 
   for (lo = reldyn->map_head.link_order; lo != NULL; lo = lo->next)
     if (lo->type == bfd_indirect_link_order)
       {
 	bfd_byte *erel, *erelend;
 	asection *o = lo->u.indirect.section;
+	int base_offset = -1;
+	int base_max = 0;
+
+	if (elf_hash_table (info)->sorted_size > 0)
+	  {
+	    base_offset = elf_hash_table (info)->sorted[0]->dynindx;
+	    base_max = base_offset + elf_hash_table (info)->sorted_size;
+	  }
 
 	if (o->contents == NULL && o->size != 0)
 	  {
@@ -6023,10 +6186,24 @@
 	p = sort + o->output_offset / ext_size * sort_elt;
 	while (erel < erelend)
 	  {
+	    long dyn_idx;
+	    size_t bucketcount = elf_hash_table (info)->bucketcount;
 	    struct elf_link_sort_rela *s = (struct elf_link_sort_rela *) p;
 	    (*swap_in) (abfd, erel, s->rela);
 	    s->type = (*bed->elf_backend_reloc_type_class) (s->rela);
 	    s->u.sym_mask = r_sym_mask;
+	    
+	    if (s->type != reloc_class_relative)
+	      dyn_idx = s->rela->r_info >> r_sym_shift;
+	    else
+	      dyn_idx = -1;
+
+	    if (info->dynsort && base_offset >= 0 &&
+		dyn_idx < base_max && dyn_idx >= base_offset)
+	      s->elf_bucket = elf_hash_table (info)->sorted [dyn_idx - base_offset]->u.elf_hash_value % bucketcount;
+	    else
+	      s->elf_bucket = 0;
+				       
 	    p += sort_elt;
 	    erel += ext_size;
 	  }
@@ -8426,7 +8612,7 @@
     }
 
   if (dynamic && info->combreloc && dynobj != NULL)
-    relativecount = elf_link_sort_relocs (abfd, info, &reldyn);
+    relativecount = elf_link_sort_relocs (abfd, &finfo, &reldyn);
 
   /* If we are linking against a dynamic object, or generating a
      shared library, finish up the dynamic linking information.  */
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elf-m10300.c binutils.current/bfd/elf-m10300.c
--- binutils-2.16.91.0.4/bfd/elf-m10300.c	2005-11-13 17:16:34.000000000 +0000
+++ binutils.current/bfd/elf-m10300.c	2005-12-23 14:06:10.000000000 +0000
@@ -3729,8 +3729,7 @@
 
   _bfd_generic_link_hash_table_free
     ((struct bfd_link_hash_table *) ret->static_hash_table);
-  _bfd_generic_link_hash_table_free
-    ((struct bfd_link_hash_table *) ret);
+  _bfd_elf_link_hash_table_free (hash);
 }
 
 static unsigned long
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elf-strtab.c binutils.current/bfd/elf-strtab.c
--- binutils-2.16.91.0.4/bfd/elf-strtab.c	2005-05-10 23:46:41.000000000 +0100
+++ binutils.current/bfd/elf-strtab.c	2006-01-18 16:37:04.000000000 +0000
@@ -39,6 +39,7 @@
     /* Entry this is a suffix of (if len < 0).  */
     struct elf_strtab_hash_entry *suffix;
   } u;
+  long hash_bucket;
 };
 
 /* The strtab hash table.  */
@@ -54,6 +55,8 @@
   bfd_size_type sec_size;
   /* Array of pointers to strtab entries.  */
   struct elf_strtab_hash_entry **array;
+  /* Array of pointers to strtab entries.  */
+  struct elf_strtab_hash_entry **array_sorted;
 };
 
 /* Routine to create an entry in a section merge hashtab.  */
@@ -117,6 +120,7 @@
     }
 
   table->array[0] = NULL;
+  table->array_sorted = NULL;
 
   return table;
 }
@@ -128,6 +132,8 @@
 {
   bfd_hash_table_free (&tab->table);
   free (tab->array);
+  if (tab->array_sorted)
+    free (tab->array_sorted);
   free (tab);
 }
 
@@ -229,6 +235,12 @@
 _bfd_elf_strtab_emit (register bfd *abfd, struct elf_strtab_hash *tab)
 {
   bfd_size_type off = 1, i;
+  struct elf_strtab_hash_entry **array;
+
+  if (tab->array_sorted != NULL)
+    array = tab->array_sorted;
+  else
+    array = tab->array;
 
   if (bfd_bwrite ("", 1, abfd) != 1)
     return FALSE;
@@ -238,12 +250,12 @@
       register const char *str;
       register unsigned int len;
 
-      BFD_ASSERT (tab->array[i]->refcount == 0);
-      len = tab->array[i]->len;
+      BFD_ASSERT (array[i]->refcount == 0);
+      len = array[i]->len;
       if ((int) len < 0)
 	continue;
 
-      str = tab->array[i]->root.string;
+      str = array[i]->root.string;
       if (bfd_bwrite (str, len, abfd) != len)
 	return FALSE;
 
@@ -278,6 +290,26 @@
   return lenA - lenB;
 }
 
+/* sort by hash bucket position */
+static int
+hash_compare (const void *a, const void *b)
+{
+  struct elf_strtab_hash_entry *A = *(struct elf_strtab_hash_entry **) a;
+  struct elf_strtab_hash_entry *B = *(struct elf_strtab_hash_entry **) b;
+
+  if (A->hash_bucket > B->hash_bucket)
+    return 1;
+  if (A->hash_bucket < B->hash_bucket)
+    return -1;
+
+  /* Make qsort faster for lots of identical empty symbols */
+  if (a > b)
+    return 1;
+  if (a < b)
+    return -1;
+  return 0;
+}
+
 static inline int
 is_suffix (const struct elf_strtab_hash_entry *A,
 	   const struct elf_strtab_hash_entry *B)
@@ -293,9 +325,8 @@
 
 /* This function assigns final string table offsets for used strings,
    merging strings matching suffixes of longer strings if possible.  */
-
 void
-_bfd_elf_strtab_finalize (struct elf_strtab_hash *tab)
+_bfd_elf_strtab_finalize (struct elf_strtab_hash *tab, size_t bucket_count)
 {
   struct elf_strtab_hash_entry **array, **a, *e;
   bfd_size_type size, amt;
@@ -361,15 +392,37 @@
 	}
     }
 
-alloc_failure:
-  if (array)
-    free (array);
+  if (bucket_count != 0)
+    {
+      array[0] = NULL;
+      for (i = 1; i < tab->size; ++i)
+	{
+	  e = tab->array[i];
+	  array[i] = e;
+
+	  if (e->len > 0)
+	    {
+	      e->hash_bucket = _bfd_elf_ver_hash (e->root.string);
+	      e->hash_bucket %= bucket_count;
+	    }
+	  else
+	    e->hash_bucket = 0;
+	}
+      qsort (array + 1, tab->size - 1, sizeof (struct elf_strtab_hash_entry *), hash_compare);
+      tab->array_sorted = array;
+    }
+  else
+    {
+      free (array);
+ alloc_failure:
+      array = tab->array;
+    }
 
   /* Assign positions to the strings we want to keep.  */
   size = 1;
   for (i = 1; i < tab->size; ++i)
     {
-      e = tab->array[i];
+      e = array[i];
       if (e->refcount && e->len > 0)
 	{
 	  e->u.index = size;
@@ -382,7 +435,7 @@
   /* Adjust the rest.  */
   for (i = 1; i < tab->size; ++i)
     {
-      e = tab->array[i];
+      e = array[i];
       if (e->refcount && e->len < 0)
 	e->u.index = e->u.suffix->u.index + (e->u.suffix->len + e->len);
     }
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/bfd/elfxx-target.h binutils.current/bfd/elfxx-target.h
--- binutils-2.16.91.0.4/bfd/elfxx-target.h	2005-08-22 20:27:41.000000000 +0100
+++ binutils.current/bfd/elfxx-target.h	2005-12-23 14:06:51.000000000 +0000
@@ -205,7 +205,7 @@
 #endif
 
 #ifndef bfd_elfNN_bfd_link_hash_table_free
-#define bfd_elfNN_bfd_link_hash_table_free _bfd_generic_link_hash_table_free
+#define bfd_elfNN_bfd_link_hash_table_free _bfd_elf_link_hash_table_free
 #endif
 
 #ifdef elf_backend_relocate_section
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/include/bfdlink.h binutils.current/include/bfdlink.h
--- binutils-2.16.91.0.4/include/bfdlink.h	2005-12-22 10:42:52.000000000 +0000
+++ binutils.current/include/bfdlink.h	2006-01-18 16:19:19.000000000 +0000
@@ -327,6 +327,9 @@
   /* TRUE if unreferenced sections should be removed.  */
   unsigned int gc_sections: 1;
 
+  /* TRUE if dynsym/dynstr/relocs should be sorted */
+  unsigned int dynsort : 1;
+
   /* What to do with unresolved symbols in an object file.
      When producing executables the default is GENERATE_ERROR.
      When producing shared libraries the default is IGNORE.  The
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/ld/eelf_i386.c binutils.current/ld/eelf_i386.c
--- binutils-2.16.91.0.4/ld/eelf_i386.c	2005-12-23 17:21:49.000000000 +0000
+++ binutils.current/ld/eelf_i386.c	2006-01-18 17:21:10.000000000 +0000
@@ -3949,6 +3949,10 @@
 	link_info.relro = TRUE;
       else if (strcmp (optarg, "norelro") == 0)
 	link_info.relro = FALSE;
+      else if (strcmp (optarg, "dynsort") == 0)
+	link_info.dynsort = TRUE;
+      else if (strcmp (optarg, "nodynsort") == 0)
+	link_info.dynsort = FALSE;
       /* What about the other Solaris -z options? FIXME.  */
       break;
     }
@@ -3966,6 +3970,7 @@
   fprintf (file, _("  --eh-frame-hdr\tCreate .eh_frame_hdr section\n"));
   fprintf (file, _("  -z combreloc\t\tMerge dynamic relocs into one section and sort\n"));
   fprintf (file, _("  -z defs\t\tReport unresolved symbols in object files.\n"));
+  fprintf (file, _("  -z dynsort\t\tSort dynamic link sections\n"));
   fprintf (file, _("  -z execstack\t\tMark executable as requiring executable stack\n"));
   fprintf (file, _("  -z initfirst\t\tMark DSO to be initialized first at runtime\n"));
   fprintf (file, _("  -z interpose\t\tMark object to interpose all DSOs but executable\n"));
@@ -3977,6 +3982,7 @@
   fprintf (file, _("  -z nodelete\t\tMark DSO non-deletable at runtime\n"));
   fprintf (file, _("  -z nodlopen\t\tMark DSO not available to dlopen\n"));
   fprintf (file, _("  -z nodump\t\tMark DSO not available to dldump\n"));
+  fprintf (file, _("  -z nodynsort\t\tDon't sort dynamic link sections\n"));
   fprintf (file, _("  -z noexecstack\tMark executable as not requiring executable stack\n"));
   fprintf (file, _("  -z norelro\t\tDon't create RELRO program header\n"));
   fprintf (file, _("  -z now\t\tMark object non-lazy runtime binding\n"));
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/ld/ldmain.c binutils.current/ld/ldmain.c
--- binutils-2.16.91.0.4/ld/ldmain.c	2005-12-07 10:42:37.000000000 +0000
+++ binutils.current/ld/ldmain.c	2006-01-18 16:19:56.000000000 +0000
@@ -316,6 +316,7 @@
   link_info.need_relax_finalize = FALSE;
   link_info.warn_shared_textrel = FALSE;
   link_info.gc_sections = FALSE;
+  link_info.dynsort = FALSE;
 
   ldfile_add_arch ("");
 
diff -u -r -x texis -x Makeconfig -x version.h -x '*.o' -x '*.1' -x 'Makefile*' -x 'config*' -x libtool -x '*.info' -x '*.tex' binutils-2.16.91.0.4/ld/ld.texinfo binutils.current/ld/ld.texinfo
--- binutils-2.16.91.0.4/ld/ld.texinfo	2005-12-07 10:42:37.000000000 +0000
+++ binutils.current/ld/ld.texinfo	2006-01-18 16:29:15.000000000 +0000
@@ -928,6 +928,9 @@
 Disallows undefined symbols in object files.  Undefined symbols in
 shared libraries are still allowed.
 
+@item dynsort
+Sorts dynamic link sections, to reduce cache misses during linking.
+
 @item execstack
 Marks the object as requiring executable stack.
 
@@ -969,7 +972,10 @@
 @item nodump
 Marks the object can not be dumped by @code{dldump}.
 
+@item nodynsort
+Disables dynamic link section sorting.
+
 @item noexecstack
 Marks the object as not requiring executable stack.
 


New module: bfdsort.c:

--- binutils-2.16.91.0.4/bfd/Makefile.am	2005-12-07 10:42:37.000000000 +0000
+++ binutils.current/bfd/Makefile.am	2006-01-18 15:50:23.000000000 +0000
@@ -32,7 +32,7 @@
 # need two copies of the executable, one to download and one for the
 # debugger).
 BFD32_LIBS = \
-	archive.lo archures.lo bfd.lo bfdio.lo bfdwin.lo \
+	archive.lo archures.lo bfd.lo bfdio.lo bfdsort.lo bfdwin.lo \
 	cache.lo coffgen.lo corefile.lo \
 	format.lo init.lo libbfd.lo opncls.lo reloc.lo \
 	section.lo syms.lo targets.lo hash.lo linker.lo \
@@ -42,7 +42,7 @@
 BFD64_LIBS = archive64.lo
 
 BFD32_LIBS_CFILES = \
-	archive.c archures.c bfd.c bfdio.c bfdwin.c \
+	archive.c archures.c bfd.c bfdio.c bfdsort.c bfdwin.c \
 	cache.c coffgen.c corefile.c \
 	format.c init.c libbfd.c opncls.c reloc.c \
 	section.c syms.c targets.c hash.c linker.c \

--- /dev/null	2006-01-18 11:12:51.092369250 +0000
+++ binutils.current/bfd/bfdsort.c	2006-01-18 15:55:17.000000000 +0000
@@ -0,0 +1,253 @@
+/*
+ * This module copied from glibc/stdlib/qsort.c & adapted,
+ * in order to add support for a closure argument.
+ */
+
+/* Copyright (C) 1991,1992,1996,1997,1999,2004 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Written by Douglas C. Schmidt (schmidt@ics.uci.edu).
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+/* If you consider tuning this algorithm, you should consult first:
+   Engineering a sort function; Jon Bentley and M. Douglas McIlroy;
+   Software - Practice and Experience; Vol. 23 (11), 1249-1265, 1993.  */
+
+#include "bfd.h"
+#include "sysdep.h"
+
+/* Byte-wise swap two items of size SIZE. */
+#define SWAP(a, b, size)						      \
+  do									      \
+    {									      \
+      register bfd_size_type __size = (size);					      \
+      register char *__a = (a), *__b = (b);				      \
+      do								      \
+	{								      \
+	  char __tmp = *__a;						      \
+	  *__a++ = *__b;						      \
+	  *__b++ = __tmp;						      \
+	} while (--__size > 0);						      \
+    } while (0)
+
+/* Discontinue quicksort algorithm when partition gets below this size.
+   This particular magic number was chosen to work best on a Sun 4/260. */
+#define MAX_THRESH 4
+
+/* Stack node declarations used to store unfulfilled partition obligations. */
+typedef struct
+  {
+    char *lo;
+    char *hi;
+  } stack_node;
+
+/* The next 4 #defines implement a very fast in-line stack abstraction. */
+/* The stack needs log (total_elements) entries (we could even subtract
+   log(MAX_THRESH)).  Since total_elements has type size_t, we get as
+   upper bound for log (total_elements):
+   bits per byte (CHAR_BIT) * sizeof(size_t).  */
+#define STACK_SIZE	(8 * sizeof(bfd_size_type))
+#define PUSH(low, high)	do { top->lo = (low); top->hi = (high); ++top; } while (0)
+#define	POP(low, high)	do { --top; low = top->lo; high = top->hi; } while (0)
+#define	STACK_NOT_EMPTY	(stack < top)
+
+
+/* Order size using quicksort.  This implementation incorporates
+   four optimizations discussed in Sedgewick:
+
+   1. Non-recursive, using an explicit stack of pointer that store the
+      next array partition to sort.  To save time, this maximum amount
+      of space required to store an array of SIZE_MAX is allocated on the
+      stack.  Assuming a 32-bit (64 bit) integer for size_t, this needs
+      only 32 * sizeof(stack_node) == 256 bytes (for 64 bit: 1024 bytes).
+      Pretty cheap, actually.
+
+   2. Chose the pivot element using a median-of-three decision tree.
+      This reduces the probability of selecting a bad pivot value and
+      eliminates certain extraneous comparisons.
+
+   3. Only quicksorts TOTAL_ELEMS / MAX_THRESH partitions, leaving
+      insertion sort to order the MAX_THRESH items within each partition.
+      This is a big win, since insertion sort is faster for small, mostly
+      sorted array segments.
+
+   4. The larger of the two sub-partitions is always pushed onto the
+      stack first, with the algorithm then concentrating on the
+      smaller partition.  This *guarantees* no more than log (total_elems)
+      stack size is needed (actually O(1) in this case)!  */
+
+void
+bfd_qsort (void *pbase, bfd_size_type total_elems, bfd_size_type size,
+	   bfd_qsort_closure_func cmp, void *closure)
+{
+  register char *base_ptr = (char *) pbase;
+
+  const bfd_size_type max_thresh = MAX_THRESH * size;
+
+  if (total_elems == 0)
+    /* Avoid lossage with unsigned arithmetic below.  */
+    return;
+
+  if (total_elems > MAX_THRESH)
+    {
+      char *lo = base_ptr;
+      char *hi = &lo[size * (total_elems - 1)];
+      stack_node stack[STACK_SIZE];
+      stack_node *top = stack;
+
+      PUSH (NULL, NULL);
+
+      while (STACK_NOT_EMPTY)
+        {
+          char *left_ptr;
+          char *right_ptr;
+
+	  /* Select median value from among LO, MID, and HI. Rearrange
+	     LO and HI so the three values are sorted. This lowers the
+	     probability of picking a pathological pivot value and
+	     skips a comparison for both the LEFT_PTR and RIGHT_PTR in
+	     the while loops. */
+
+	  char *mid = lo + size * ((hi - lo) / size >> 1);
+
+	  if ((*cmp) ((void *) mid, (void *) lo, closure) < 0)
+	    SWAP (mid, lo, size);
+	  if ((*cmp) ((void *) hi, (void *) mid, closure) < 0)
+	    SWAP (mid, hi, size);
+	  else
+	    goto jump_over;
+	  if ((*cmp) ((void *) mid, (void *) lo, closure) < 0)
+	    SWAP (mid, lo, size);
+	jump_over:;
+
+	  left_ptr  = lo + size;
+	  right_ptr = hi - size;
+
+	  /* Here's the famous ``collapse the walls'' section of quicksort.
+	     Gotta like those tight inner loops!  They are the main reason
+	     that this algorithm runs much faster than others. */
+	  do
+	    {
+	      while ((*cmp) ((void *) left_ptr, (void *) mid, closure) < 0)
+		left_ptr += size;
+
+	      while ((*cmp) ((void *) mid, (void *) right_ptr, closure) < 0)
+		right_ptr -= size;
+
+	      if (left_ptr < right_ptr)
+		{
+		  SWAP (left_ptr, right_ptr, size);
+		  if (mid == left_ptr)
+		    mid = right_ptr;
+		  else if (mid == right_ptr)
+		    mid = left_ptr;
+		  left_ptr += size;
+		  right_ptr -= size;
+		}
+	      else if (left_ptr == right_ptr)
+		{
+		  left_ptr += size;
+		  right_ptr -= size;
+		  break;
+		}
+	    }
+	  while (left_ptr <= right_ptr);
+
+          /* Set up pointers for next iteration.  First determine whether
+             left and right partitions are below the threshold size.  If so,
+             ignore one or both.  Otherwise, push the larger partition's
+             bounds on the stack and continue sorting the smaller one. */
+
+          if ((bfd_size_type) (right_ptr - lo) <= max_thresh)
+            {
+              if ((bfd_size_type) (hi - left_ptr) <= max_thresh)
+		/* Ignore both small partitions. */
+                POP (lo, hi);
+              else
+		/* Ignore small left partition. */
+                lo = left_ptr;
+            }
+          else if ((bfd_size_type) (hi - left_ptr) <= max_thresh)
+	    /* Ignore small right partition. */
+            hi = right_ptr;
+          else if ((right_ptr - lo) > (hi - left_ptr))
+            {
+	      /* Push larger left partition indices. */
+              PUSH (lo, right_ptr);
+              lo = left_ptr;
+            }
+          else
+            {
+	      /* Push larger right partition indices. */
+              PUSH (left_ptr, hi);
+              hi = right_ptr;
+            }
+        }
+    }
+
+  /* Once the BASE_PTR array is partially sorted by quicksort the rest
+     is completely sorted using insertion sort, since this is efficient
+     for partitions below MAX_THRESH size. BASE_PTR points to the beginning
+     of the array to sort, and END_PTR points at the very last element in
+     the array (*not* one beyond it!). */
+
+#define min(x, y) ((x) < (y) ? (x) : (y))
+
+  {
+    char *const end_ptr = &base_ptr[size * (total_elems - 1)];
+    char *tmp_ptr = base_ptr;
+    char *thresh = min(end_ptr, base_ptr + max_thresh);
+    register char *run_ptr;
+
+    /* Find smallest element in first threshold and place it at the
+       array's beginning.  This is the smallest array element,
+       and the operation speeds up insertion sort's inner loop. */
+
+    for (run_ptr = tmp_ptr + size; run_ptr <= thresh; run_ptr += size)
+      if ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, closure) < 0)
+        tmp_ptr = run_ptr;
+
+    if (tmp_ptr != base_ptr)
+      SWAP (tmp_ptr, base_ptr, size);
+
+    /* Insertion sort, running from left-hand-side up to right-hand-side.  */
+
+    run_ptr = base_ptr + size;
+    while ((run_ptr += size) <= end_ptr)
+      {
+	tmp_ptr = run_ptr - size;
+	while ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, closure) < 0)
+	  tmp_ptr -= size;
+
+	tmp_ptr += size;
+        if (tmp_ptr != run_ptr)
+          {
+            char *trav;
+
+	    trav = run_ptr + size;
+	    while (--trav >= run_ptr)
+              {
+                char c = *trav;
+                char *hi, *lo;
+
+                for (hi = lo = trav; (lo -= size) >= tmp_ptr; hi = lo)
+                  *hi = *lo;
+                *hi = c;
+              }
+          }
+      }
+  }
+}

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]