This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PING] [PATCH] faster string operations for bulldozer (take 2)
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: Carlos O'Donell <carlos at systemhalted dot org>, libc-alpha at sourceware dot org
- Date: Sat, 27 Apr 2013 03:12:34 +0200
- Subject: Re: [PING] [PATCH] faster string operations for bulldozer (take 2)
- References: <20120926171541 dot GA12300 at domone dot kolej dot mff dot cuni dot cz> <20120926172758 dot 56BEC2C097 at topped-with-meat dot com> <20120926184013 dot GA13454 at domone dot kolej dot mff dot cuni dot cz> <20120926194423 dot 15F9B2C061 at topped-with-meat dot com> <20120926211433 dot GA17771 at domone dot kolej dot mff dot cuni dot cz> <CAE2sS1i2nfrn58PNwtOXYx9qt=bWX_C4_fJN=CnGuefBTvN-Bw at mail dot gmail dot com> <20120930103730 dot GA5682 at domone dot kolej dot mff dot cuni dot cz> <20130426165614 dot GA16694 at domone dot kolej dot mff dot cuni dot cz> <20130426192321 dot 720E02C061 at topped-with-meat dot com>
On Fri, Apr 26, 2013 at 12:23:21PM -0700, Roland McGrath wrote:
> > > + /* Assume unaligned loads are fast when avx is available. */
>
> AVX in caps.
>
> > > + if ((ecx & bit_AVX) != 0)
> > > + __cpu_features.feature[index_Fast_Rep_String]
> > > + |= ( bit_Fast_Unaligned_Load);
>
> Drop the excess parens (and the excess space).
>
> I didn't follow whatever previous discussion there was about the substance
> of this. What is the rationale/evidence that AVX is (and always will be)
> correlated with efficiency of unaligned loads?
On Bulldozer, unaligned loads carry only a small penalty. You can read
about the Bulldozer improvements or verify it with a simple benchmark.
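
A minimal sketch of what such a benchmark could look like, not the one
used for this patch. The names sum_words and time_loads are hypothetical,
and it assumes that scalar 64-bit loads are an acceptable stand-in for
the SSE loads that bit_Fast_Unaligned_Load actually concerns. It times
the same pass over a buffer that starts either on a 64-byte boundary or
a few bytes past it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Sum N 64-bit words starting at P.  memcpy is the portable way to
   express a possibly unaligned load; compilers typically turn it into
   a single load instruction on x86-64.  */
static uint64_t
sum_words (const unsigned char *p, size_t n)
{
  uint64_t sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      uint64_t w;
      memcpy (&w, p + i * sizeof w, sizeof w);
      sum += w;
    }
  return sum;
}

/* Time ROUNDS passes over N words that start OFFSET bytes past a
   64-byte boundary.  Returns CPU seconds and stores a checksum in
   *CHECKSUM so that the loads cannot be optimized away.  */
static double
time_loads (size_t n, size_t offset, int rounds, uint64_t *checksum)
{
  assert (offset < 64);
  unsigned char *raw = malloc (n * sizeof (uint64_t) + 128);
  assert (raw != NULL);

  /* Round up to a 64-byte boundary, then step off it by OFFSET.  */
  unsigned char *p = raw + (64 - (uintptr_t) raw % 64) % 64 + offset;

  /* The contents depend only on the index, so every offset sees the
     same data and must produce the same checksum.  */
  for (size_t i = 0; i < n * sizeof (uint64_t); i++)
    p[i] = (unsigned char) (i * 31 + 7);

  uint64_t total = 0;
  clock_t start = clock ();
  for (int r = 0; r < rounds; r++)
    {
      p[0] = (unsigned char) r;	/* Keep the passes from being merged.  */
      total += sum_words (p, n);
    }
  clock_t end = clock ();

  *checksum = total;
  free (raw);
  return (double) (end - start) / CLOCKS_PER_SEC;
}
```

Comparing time_loads (n, 0, rounds, &c) against time_loads (n, 1, rounds, &c)
for a buffer that fits in L1 cache gives a rough estimate of the
penalty. The checksums should match, which confirms both runs did the
same work.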
And what evidence is there that AVX on Intel processors is always
correlated with efficiency of unaligned loads?