This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
- From: Will Newton <will dot newton at linaro dot org>
- To: "Joseph S. Myers" <joseph at codesourcery dot com>
- Cc: "libc-ports at sourceware dot org" <libc-ports at sourceware dot org>, Patch Tracking <patches at linaro dot org>
- Date: Fri, 30 Aug 2013 15:56:35 +0100
- Subject: Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
- Authentication-results: sourceware.org; auth=none
- References: <520894D5 dot 7060207 at linaro dot org> <Pine dot LNX dot 4 dot 64 dot 1308292353450 dot 1487 at digraph dot polyomino dot org dot uk>
On 30 August 2013 00:58, Joseph S. Myers <joseph@codesourcery.com> wrote:
Hi Joseph,
>> A small change to the entry to the aligned copy loop improves
>> performance slightly on A9 and A15 cores for certain copies.
>
> Could you clarify what you mean by "certain copies"?
Large copies (> 16kB) where the buffers are 4-byte aligned but not
8-byte aligned. I'll respin the patch with an improved description.
> In particular, have you verified that for all three choices in this code
> (NEON, VFP or neither), the code for unaligned copies is at least as fast
> in this case (common 32-bit alignment, but not common 64-bit alignment) as
> the code that would previously have been used in those cases?
Yes, the performance is very similar but slightly better in the NEON
case and approximately unchanged in the others.
> There are various comments regarding alignment, whether stating "LDRD/STRD
> support unaligned word accesses" or referring to the mutual alignment that
> applies for particular code. Does this patch make any of them out of
> date? (If code can now only be reached with common 64-bit alignment, but
> in fact requires only 32-bit alignment, the comment should probably state
> both those things explicitly.)
I've reviewed the comments and they all look ok as far as I can tell.
--
Will Newton
Toolchain Working Group, Linaro