This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.

Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.

From: Will Newton <will dot newton at linaro dot org>
To: "Joseph S. Myers" <joseph at codesourcery dot com>
Cc: "libc-ports at sourceware dot org" <libc-ports at sourceware dot org>, Patch Tracking <patches at linaro dot org>
Date: Fri, 30 Aug 2013 15:56:35 +0100
Subject: Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
References: <520894D5 dot 7060207 at linaro dot org> <Pine dot LNX dot 4 dot 64 dot 1308292353450 dot 1487 at digraph dot polyomino dot org dot uk>

On 30 August 2013 00:58, Joseph S. Myers <joseph@codesourcery.com> wrote:

Hi Joseph,

>> A small change to the entry to the aligned copy loop improves
>> performance slightly on A9 and A15 cores for certain copies.
>
> Could you clarify what you mean by "certain copies"?

Large copies (more than 16 kB) where the buffers are 4-byte aligned but
not 8-byte aligned. I'll respin the patch with an improved description.

> In particular, have you verified that for all three choices in this code
> (NEON, VFP or neither), the code for unaligned copies is at least as fast
> in this case (common 32-bit alignment, but not common 64-bit alignment) as
> the code that would previously have been used in those cases?

Yes. Performance is very similar overall: slightly better in the NEON
case and approximately unchanged in the other two.
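
For anyone who wants to pick out the copies under discussion in a
benchmark, here is a minimal C sketch of the condition. It is not part of
the patch. It assumes the "common 32-bit alignment, but not common 64-bit
alignment" reading quoted above, i.e. the two addresses agree modulo 4 but
differ modulo 8. The names is_affected_copy and LARGE_COPY_BYTES are
hypothetical, and the threshold is simply the 16 kB figure mentioned
earlier.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: the "> 16 kB" figure from the discussion. */
#define LARGE_COPY_BYTES ((size_t)16 * 1024)

/*
 * Returns true when a copy of n bytes from src to dst falls into the
 * case discussed here (an assumption, not taken from memcpy_impl.S):
 *   - the copy is larger than LARGE_COPY_BYTES, and
 *   - dst and src share the same alignment modulo 4 but not modulo 8.
 *
 * XOR-ing the addresses leaves a 1 in every bit position where they
 * differ; a low three-bit pattern of 100 means bits 0 and 1 match
 * (common 32-bit alignment) while bit 2 differs (no common 64-bit
 * alignment).
 */
static bool is_affected_copy(uintptr_t dst, uintptr_t src, size_t n)
{
    return n > LARGE_COPY_BYTES && ((dst ^ src) & 7u) == 4u;
}
```

A benchmark harness could call this on each (dst, src, n) triple to bucket
results separately for the affected and unaffected cases.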

> There are various comments regarding alignment, whether stating "LDRD/STRD
> support unaligned word accesses" or referring to the mutual alignment that
> applies for particular code.  Does this patch make any of them out of
> date?  (If code can now only be reached with common 64-bit alignment, but
> in fact requires only 32-bit alignment, the comment should probably state
> both those things explicitly.)

I've reviewed the comments and they all look ok as far as I can tell.

-- 
Will Newton
Toolchain Working Group, Linaro

Follow-Ups:
- Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
  - From: Joseph S. Myers

References:
- [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
  - From: Will Newton
- Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
  - From: Joseph S. Myers
