This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2] Faster strchr implementation.
- From: Liubov Dmitrieva <liubov dot dmitrieva at gmail dot com>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Fri, 16 Aug 2013 15:17:58 +0400
- Subject: Re: [PATCH v2] Faster strchr implementation.
- References: <20130807140911 dot GA31968 at domone dot kolej dot mff dot cuni dot cz> <CAHjhQ926EE-MYDJR5Eftf+DUefBg-Gox0pw57vZ7XUwsO3OPJg at mail dot gmail dot com> <20130816095908 dot GA15776 at domone dot kolej dot mff dot cuni dot cz>
Can you please attach the patch itself so that it can be reviewed, including for style issues?
--
Liubov Dmitrieva
On Fri, Aug 16, 2013 at 1:59 PM, Ondřej Bílka <neleai@seznam.cz> wrote:
> On Thu, Aug 08, 2013 at 10:22:36PM +0400, Liubov Dmitrieva wrote:
> A strchr will need to rerun tests.
>
> Hi, I tuned my implementation to decrease loop overhead. It decreases
> loop overhead by a significant constant factor over the previous
> implementation.
>
> There are architectures that I do not cover:
> haswell - the avx2 implementation that I posted is better, and it is
> better to post it separately.
>
> atom - The loop caused a big overhead for sizes around 64 bytes and we need
> to work on a more effective header; we keep the no-bsf implementation for now.
>
> silvermont - similar issues to atom, but we need a separate IFUNC case to
> choose the no-bsf variant.
>
> athlon, athlon x2 - Same situation, and we also need a flag to choose the other variant.
>
> I updated the results of my profiler.
> In my random test, strchr would always find the terminating zero. This does
> not happen in practice, so now strchr will find the character with 50%
> probability.
>
> http://kam.mff.cuni.cz/~ondra/benchmark_string/strchr_profile.html
> http://kam.mff.cuni.cz/~ondra/benchmark_string/strchr_profile160813.tar.bz2
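
The change to the random test described above (the searched character is present with roughly 50% probability, otherwise strchr runs to the terminating zero) could be sketched as follows. This is an illustration only, not code from the profiler linked above: the helper name make_strchr_input, its signature, and the use of rand () are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from the profiler.
   Fill BUF (capacity LEN + 1) with a random NUL-terminated string of
   length LEN that does not contain C.  Then, with probability of
   about 1/2, plant C at a random position.
   Return the position of C, or -1 if C was not planted.  */
static long
make_strchr_input (char *buf, size_t len, char c, unsigned int seed)
{
  assert (c != '\0');   /* Searching for NUL always hits the terminator.  */
  srand (seed);
  for (size_t i = 0; i < len; i++)
    {
      char x;
      do
        x = (char) (1 + rand () % 255);   /* Nonzero byte, never NUL.  */
      while (x == c);
      buf[i] = x;
    }
  buf[len] = '\0';

  /* Miss case: strchr has to scan up to the terminating zero.  */
  if (len == 0 || rand () % 2 == 0)
    return -1;

  /* Hit case: exactly one occurrence of C, at a random position.  */
  size_t pos = (size_t) rand () % len;
  buf[pos] = c;
  return (long) pos;
}
```

With inputs generated this way, strchr (buf, c) returns buf + pos in the hit case and NULL in the miss case, so a benchmark exercises both the early-exit path and the full scan.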
>
>
> OK to commit this iteration?
>
> * sysdeps/x86_64/multiarch/ifunc-impl-list.c
> (__libc_ifunc_impl_list): Remove __strchr_sse42.
> * sysdeps/x86_64/multiarch/strchr.S (__strchr_sse42): Remove.
> (strchr): Remove __strchr_sse42 ifunc selection.
> * sysdeps/x86_64/strchr.S (strchr): Use optimized implementation.
> * sysdeps/x86_64/strchrnul.S: Include sysdeps/x86_64/strchr.S.
>
>
> diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> index 28d3579..8486294 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c