This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
Re: [ARM] Optimised strchr and strlen
On 24 December 2011 21:01, Richard Henderson <rth@twiddle.net> wrote:
> On 12/23/2011 12:31 PM, David Gilbert wrote:
>> Sure; it's pretty much the same trick as my strlen routine.
> ...
>> OK, so I gave that a go - and the results are:
>
> I can't help but wonder if just the one branch in the first loop is best.
Yes.
> Also, it appears one can use uqadd8 and do the aligned two words in parallel
> rather than having everything serialize on the GT flags and SEL.
>
> I've run this through glibc's test-strchr, but haven't gotten around to
> benchmarking it at all. Since you've already got that set up, perhaps
> you could give it a whirl.
Here we go - your code is the green line, rth_strchr. Your uqadd8
trick is very nice; the peak speed is a nice bit higher than my version
using a set of uadd8's and sel (you get 1 instruction less in the main loop).
https://wiki.linaro.org/WorkingGroups/ToolChain/Benchmarks/InitialStrchr?action=AttachFile&do=view&target=strchr-withrth-strchr-abs.png
The simple routine is still easily winning below 32 bytes though, and
there is still that odd notch at 16.
(I think your uqadd8 trick would be a nice improvement on my strlen
and memchr routines).
Dave
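
For readers who have not seen the trick discussed in this thread, here is a rough sketch in portable C of how a per-byte unsigned saturating add can flag zero bytes, and how the results for two aligned words can be combined before a single test. It is not the code from either routine benchmarked above: uqadd8_model is a software stand-in for the ARM UQADD8 instruction, and the 0xFEFEFEFE constant and the helper names has_zero_byte and pair_has_hit are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Software model of a per-byte unsigned saturating add (the behaviour of
   ARM's UQADD8): each of the four byte lanes is added independently and
   clamped at 0xFF, with no carry between lanes. */
static uint32_t uqadd8_model(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFF;
        uint32_t y = (b >> (8 * lane)) & 0xFF;
        uint32_t sum = x + y;
        if (sum > 0xFF)
            sum = 0xFF;                 /* saturate this lane */
        result |= sum << (8 * lane);
    }
    return result;
}

/* Hypothetical helper: adding 0xFE to every byte turns any nonzero byte
   into 0xFF and leaves a zero byte at 0xFE, so the word is all ones
   exactly when it contains no zero byte. */
static int has_zero_byte(uint32_t word)
{
    return uqadd8_model(word, 0xFEFEFEFEu) != 0xFFFFFFFFu;
}

/* Hypothetical helper: test two aligned words at once for either the
   terminator or the byte c. XOR with c replicated into every lane turns
   matching bytes into zero; the four saturating adds are independent of
   each other, and their results are ANDed into one value to test. */
static int pair_has_hit(uint32_t w0, uint32_t w1, uint8_t c)
{
    const uint32_t k = 0xFEFEFEFEu;
    uint32_t rep = 0x01010101u * c;     /* c in all four lanes */
    uint32_t all = uqadd8_model(w0, k) & uqadd8_model(w1, k)
                 & uqadd8_model(w0 ^ rep, k) & uqadd8_model(w1 ^ rep, k);
    return all != 0xFFFFFFFFu;
}
```

In a real loop, pair_has_hit would be the only check per eight bytes, and the position of the hit would be worked out after leaving the loop. The point of the sketch is that none of the saturating adds depends on flags set by another, unlike a sequence built on uadd8 and sel.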