This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.


On 08/30/2013 02:48 PM, Will Newton wrote:
> On 30 August 2013 18:14, Carlos O'Donell <carlos@redhat.com> wrote:
> 
> Hi Carlos,
> 
>>>> A small change to the entry to the aligned copy loop improves
>>>> performance slightly on A9 and A15 cores for certain copies.
>>>>
>>>> ports/ChangeLog.arm:
>>>>
>>>> 2013-08-07  Will Newton  <will.newton@linaro.org>
>>>>
>>>>         * sysdeps/arm/armv7/multiarch/memcpy_impl.S: Tighten check
>>>>         on entry to aligned copy loop for improved performance.
>>>> ---
>>>>  ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> Ping?
>>
>> How did you test the performance?
>>
>> glibc has a performance microbenchmark, did you use that?
> 
> No, I used the cortex-strings package developed by Linaro for
> benchmarking various string functions against one another[1].
> 
> I haven't checked the glibc benchmarks but I'll look into that. It's
> quite a specific case that shows the problem so it may not be obvious
> which one is better however.

If it's not obvious, how is someone supposed to review this patch? :-)

> [1] https://launchpad.net/cortex-strings

There are 2 benchmarks. One appears to be dhrystone 2.1, which is not a string
test in and of itself and should not be used for benchmarking or changing
string functions. The other is called "multi" and appears to run some functions
in a loop and time them.

e.g.
http://bazaar.launchpad.net/~linaro-toolchain-dev/cortex-strings/trunk/view/head:/benchmarks/multi/harness.c
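
For reference, a timing loop of that general shape comes down to roughly the
following. This is only a minimal sketch under stated assumptions, not the
cortex-strings harness and not the glibc benchmark code: the names `now_ns`,
`bench_memcpy`, `MAX_LEN` and `PAD` are hypothetical, and the empty asm
statement used as a compiler barrier assumes GCC or Clang.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define MAX_LEN 4096  /* hypothetical: largest copy size the sketch supports */
#define PAD     64    /* hypothetical: room for alignment offsets */

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: time `iters` calls of memcpy for one length and one
   pair of buffer offsets. Returns the average nanoseconds per call, or a
   negative value if the arguments are out of range. */
static double bench_memcpy(size_t len, size_t dst_off, size_t src_off,
                           unsigned iters)
{
    static unsigned char src[MAX_LEN + PAD];
    static unsigned char dst[MAX_LEN + PAD];
    uint64_t start, end;
    unsigned i;

    if (len > MAX_LEN || dst_off >= PAD || src_off >= PAD || iters == 0)
        return -1.0;

    /* Touch the source once so the timed loop does not include page faults. */
    memset(src, 0x5a, sizeof src);

    start = now_ns();
    for (i = 0; i < iters; i++) {
        memcpy(dst + dst_off, src + src_off, len);
        /* Compiler barrier so the copy cannot be optimized away. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    end = now_ns();

    return (double)(end - start) / (double)iters;
}
```

Calling something like `bench_memcpy(len, dst_off, src_off, iters)` for the
old and new implementations over a range of lengths and offsets, and posting
both sets of numbers, is what makes a specific case visible to a reviewer. A
single length or a single alignment would not be enough to judge it.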

I would not call `multi' exhaustive. The glibc performance benchmark tests
are not exhaustive either, but they have received review from the glibc
community and are our preferred way of demonstrating performance gains when
posting performance patches.

I would really like to see you post the results of running your new
implementation with the glibc benchmark, with numbers that show it is
faster. Is that possible?

Cheers,
Carlos.
 

