This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH RFC V4] Improve 64bit memcpy/memove for Corei7 with unaligned avx instruction

From: Andreas Jaeger <aj at suse dot com>
To: ling dot ma dot program at gmail dot com
Cc: libc-alpha at sourceware dot org, neleai at seznam dot cz, liubov dot dmitrieva at gmail dot com, Ma Ling <ling dot ml at alibaba-inc dot com>
Date: Mon, 29 Jul 2013 11:52:55 +0200
Subject: Re: [PATCH RFC V4] Improve 64bit memcpy/memove for Corei7 with unaligned avx instruction
References: <1375090855-8312-1-git-send-email-ling dot ma dot program at gmail dot com>

On 07/29/2013 11:40 AM, ling.ma.program@gmail.com wrote:
> From: Ma Ling <ling.ml@alibaba-inc.com>
> 
> We avoid branch instructions and force the destination to be aligned
> for the AVX instructions. We then modified gcc.403 so that it measures only
> the memcpy function. The gcc.403 benchmarks indicate that this version improves
> performance by 4% to 14% compared with memcpy_sse2_unaligned on a Haswell machine.
> 
> case     avx_unaligned  sse2_unaligned  AVX vs SSE2
> 200i         146833745       168384142  1.146767332
> g23         1431207341      1557405243  1.088175835
> 166i         350901531       379068674  1.08027079
> cp-decl      370750774       395890196  1.067806796
> c-type       763780824       810806468  1.061569553
> expr2        986698539      1067232192  1.081619309
> expr         727016829       758953883  1.043928906
> s04         1117900758      1185159528  1.060165242
> scilab        63309111        66893431  1.05661618
> (We will send test patch on memcpy for above cases)
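
For reference, the "AVX vs SSE2" column in the quoted table appears to be the sse2_unaligned figure divided by the avx_unaligned figure. A minimal Python sketch that recomputes it is below. The numbers are copied from the table; the names `results`, `ratio` and `ratios` are illustrative only, and the unit of the measurements is not stated in the original message.

```python
# Measurements copied from the quoted table: (avx_unaligned, sse2_unaligned).
# The original message does not state the unit of these figures.
results = {
    "200i": (146833745, 168384142),
    "g23": (1431207341, 1557405243),
    "166i": (350901531, 379068674),
    "cp-decl": (370750774, 395890196),
    "c-type": (763780824, 810806468),
    "expr2": (986698539, 1067232192),
    "expr": (727016829, 758953883),
    "s04": (1117900758, 1185159528),
    "scilab": (63309111, 66893431),
}


def ratio(avx, sse2):
    """Return sse2_unaligned / avx_unaligned; above 1.0 favours the AVX version."""
    return sse2 / avx


ratios = {case: ratio(avx, sse2) for case, (avx, sse2) in results.items()}

# Print the cases from smallest to largest ratio.
for case, r in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{case:8s} {r:.9f}  ({(r - 1) * 100:.1f}% difference)")
```

The smallest ratio is for expr (about 1.044) and the largest for 200i (about 1.147), which is consistent with the 4% to 14% range quoted above.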

Is memcpy_sse2_unaligned really the right function to compare with?
Isn't __memcpy_ssse3 used on Haswell today?

Andreas
-- 
 Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
  SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
   GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
    GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126

Follow-Ups:
- Re: [PATCH RFC V4] Improve 64bit memcpy/memove for Corei7 with unaligned avx instruction
  - From: Ondřej Bílka
- Re: [PATCH RFC V4] Improve 64bit memcpy/memove for Corei7 with unaligned avx instruction
  - From: Ling Ma

References:
- [PATCH RFC V4] Improve 64bit memcpy/memove for Corei7 with unaligned avx instruction
  - From: ling . ma . program
