This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] BZ #14059 - HAS_FMA4 check needs to also check for AVX


On 05/11/2012 09:10 AM, Allan McRae wrote:
On 11/05/12 16:48, Andreas Jaeger wrote:
On 05/11/2012 08:34 AM, Andreas Jaeger wrote:
On 05/11/2012 12:18 AM, Carlos O'Donell wrote:
On Thu, May 10, 2012 at 6:08 PM, Carlos O'Donell
<carlos@systemhalted.org>    wrote:
On Tue, May 8, 2012 at 4:23 AM, Andreas Jaeger<aj@suse.de> wrote:

There are two things broken in checking for HAS_FMA4/AVX on x86-64 in the multiarch code:

1. We should disable FAM4 if AVX is not available (due to disabled
     OSXSAVE)

2. commit 56f6f6a2403cfa7267cad722597113be35ecf70d reverted some changes
     from commit 08cf777f9e7f6d826658a99c7d77a359f73a45bf but forgot to
     revert the change for AVX. We really have to disable AVX if it's

The issue is easy reproduceable under Xen for the reporter.

Also, the two commits introduced YMM_Usable and removed its usage but
did not remove the code from the headers, I'm cleaning this up as well.

I'm appending a patch that I tested on Linux/x86-64 (without Xen).

Ok to commit?

No.


I actually think the code Ulrich wrote originally was almost correct,
but missing this:

If I had to fix it it would look like this:

Thanks, this looks great - but misses the check for FMA4.


Here's a combined patch that compiled fine for me on
Linux/x86-64. Jeff, could you test it under Xen, please?

Andreas

2012-05-11  Andreas Jaeger<aj@suse.de>
	Carlos O'Donell<carlos_odonell@mentor.com>

	[BZ #14059]
	* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
	Fix check for AVX, enable FMA4 only if it exists and if AVX is
	usable.

	* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX_Usable): Renamed
	from bit_YMM_usable.
	(bit_FMA4_Usable): New macro.
	(bit_XMM_state): New macro.
	(bit_YMM_state): New macro.
	(index_AVX_Usable): Renamed from index_YMM_Usable.
	(index_FMA4_Usable): New macro.
	(HAS_YMM_USABLE): Delete.
	(HAS_AVX): Use HAS_ARCH_FEATURE.
	(HAS_FMA4): Likewise.

That version above contained a typo. Here's an update


Allan, does this work now?


No. I still get an illegal instruction in __strcasecmp_l_avx with this version.

Allan, thanks for testing!


Indeed, it fails:

The code tests the bits in cpuid and does not use the HAS_AVX flag:
#  ifdef HAVE_AVX_SUPPORT
        leaq    __strcasecmp_avx(%rip), %rax
        testl   $bit_AVX, __cpu_features+CPUID_OFFSET+index_AVX(%rip)
        jnz     2f
#  endif

Please test the following.

I'm now offline until Monday, so won't be able to continue this, if
anybody likes to finish and commit the patch, I would appreciate it,

Andreas

2012-05-11 Andreas Jaeger <aj@suse.de>

* sysdeps/x86_64/multiarch/strcmp.S: Use bit_AVX_Usable.


diff --git a/sysdeps/x86_64/multiarch/strcmp.S b/sysdeps/x86_64/multiarch/strcmp.S
index 2b9870b..eaf852b 100644
--- a/sysdeps/x86_64/multiarch/strcmp.S
+++ b/sysdeps/x86_64/multiarch/strcmp.S
@@ -1,5 +1,5 @@
/* strcmp with SSE4.2
- Copyright (C) 2009, 2010, 2011 Free Software Foundation, Inc.
+ Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
Contributed by Intel Corporation.
This file is part of the GNU C Library.


@@ -106,7 +106,7 @@ ENTRY(__strcasecmp)
1:
# ifdef HAVE_AVX_SUPPORT
leaq __strcasecmp_avx(%rip), %rax
- testl $bit_AVX, __cpu_features+CPUID_OFFSET+index_AVX(%rip)
+ testl $bit_AVX_Usable, __cpu_features+FEATURE_OFFSET+index_AVX_Usable(%rip)
jnz 2f
# endif
leaq __strcasecmp_sse42(%rip), %rax
@@ -129,7 +129,7 @@ ENTRY(__strncasecmp)
1:
# ifdef HAVE_AVX_SUPPORT
leaq __strncasecmp_avx(%rip), %rax
- testl $bit_AVX, __cpu_features+CPUID_OFFSET+index_AVX(%rip)
+ testl $bit_AVX_Usable, __cpu_features+FEATURE_OFFSET+index_AVX_Usable(%rip)
jnz 2f
# endif
leaq __strncasecmp_sse42(%rip), %rax


--
 Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
  SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
   GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
    GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]