This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[PATCH] faster string operations for Bulldozer.


Hello, when I added fx10 to my benchmarks I noticed that the
SSE4_2 variants are selected for strlen and related functions.

The difference between the SSE4_2 and pminub variants is even bigger
than on Ivy Bridge: the speed of pminub is almost identical, while
pcmpistri is 40% slower on fx10 than on Ivy Bridge.

---
 ChangeLog                            |    5 +++++
 sysdeps/x86_64/multiarch/init-arch.c |    3 +++
 2 files changed, 8 insertions(+), 0 deletions(-)
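
For readers unfamiliar with the selection mechanism, here is a minimal
sketch of how a preference bit can override the SSE4_2 choice. All names
below (the SKETCH_* constants and sketch_select_strlen) are hypothetical
and are not glibc identifiers; the ordering of the checks is an
assumption made for illustration, not a copy of the real selector.

```c
/* Hypothetical feature bits; the values are illustrative only. */
enum sketch_feature
{
  SKETCH_HAS_SSE4_2 = 1u << 0,    /* CPU supports SSE4.2 */
  SKETCH_PREFER_PMINUB = 1u << 1  /* pminub variant is faster here */
};

/* Hypothetical identifiers for the available strlen variants. */
enum sketch_strlen_variant
{
  SKETCH_STRLEN_SSE2,
  SKETCH_STRLEN_SSE4_2,
  SKETCH_STRLEN_PMINUB
};

/* Pick a variant.  The preference bit is tested first, so a CPU that
   supports SSE4.2 but runs it slowly still gets the pminub variant.  */
enum sketch_strlen_variant
sketch_select_strlen (unsigned int features)
{
  if (features & SKETCH_PREFER_PMINUB)
    return SKETCH_STRLEN_PMINUB;
  if (features & SKETCH_HAS_SSE4_2)
    return SKETCH_STRLEN_SSE4_2;
  return SKETCH_STRLEN_SSE2;
}
```

With only SKETCH_HAS_SSE4_2 set the sketch returns the SSE4_2 variant,
which corresponds to the behaviour observed on fx10 before this patch;
setting the preference bit as well switches it to the pminub variant.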

diff --git a/ChangeLog b/ChangeLog
index 123f339..5277afb 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2012-09-26  Ondrej Bilka <neleai@seznam.cz>
+
+ * sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Select
+   faster string function implementation for Bulldozer.
+
 2012-09-26  Marek Polacek  <polacek@redhat.com>

  [BZ #14530]
diff --git a/sysdeps/x86_64/multiarch/init-arch.c b/sysdeps/x86_64/multiarch/init-arch.c
index fb44dcf..b872e5f 100644
--- a/sysdeps/x86_64/multiarch/init-arch.c
+++ b/sysdeps/x86_64/multiarch/init-arch.c
@@ -131,6 +131,9 @@ __init_cpu_features (void)
  __cpu_features.feature[index_Prefer_SSE_for_memop]
    |= bit_Prefer_SSE_for_memop;

+  __cpu_features.feature[index_Prefer_PMINUB_for_stringop]
+   |= bit_Prefer_PMINUB_for_stringop;
+
       unsigned int eax;
       __cpuid (0x80000000, eax, ebx, ecx, edx);
       if (eax >= 0x80000001)
-- 
1.7.4.4



