[PATCH v1 2/2] x86: Remove unnecessary overflow check from wcsnlen-sse4_1.S

H.J. Lu hjl.tools@gmail.com
Thu Jun 24 00:41:53 GMT 2021


On Wed, Jun 23, 2021 at 5:00 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> No bug. The way wcsnlen will check if near the end of maxlen
> is the following macro:
>
>         mov     %r11, %rsi;     \
>         subq    %rax, %rsi;     \
>         andq    $-64, %rax;     \
>         testq   $-64, %rsi;     \
>         je      L(strnlen_ret)
>
> Which words independently of s + maxlen overflowing. So the
> second overflow check is unnecissary for correctness and
> just extra overhead in the common no overflow case.
>
> test-strlen.c, test-wcslen.c, test-strnlen.c and test-wcsnlen.c are
> all passing
>
> Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
> ---
> Sorry I didn't notice this earlier before my last commit. As
> of submitting this patch
>
>         a775a7a3eb1e85b54af0b4ee5ff4dcf66772a1fb
>
> Is HEAD of master to maybe rebase so commit history isnt messy?
>
>
>  sysdeps/x86_64/multiarch/strlen-vec.S | 7 -------
>  1 file changed, 7 deletions(-)
>
> diff --git a/sysdeps/x86_64/multiarch/strlen-vec.S b/sysdeps/x86_64/multiarch/strlen-vec.S
> index 439e486a43..b7657282bd 100644
> --- a/sysdeps/x86_64/multiarch/strlen-vec.S
> +++ b/sysdeps/x86_64/multiarch/strlen-vec.S
> @@ -71,19 +71,12 @@ L(n_nonzero):
>     suffice.  */
>         mov     %RSI_LP, %R10_LP
>         sar     $62, %R10_LP
> -       test    %R10_LP, %R10_LP
>         jnz     __wcslen_sse4_1
>         sal     $2, %RSI_LP
>  # endif
>
> -
>  /* Initialize long lived registers.  */
> -
>         add     %RDI_LP, %RSI_LP
> -# ifdef AS_WCSLEN
> -/* Check for overflow again from s + maxlen * sizeof(wchar_t).  */
> -       jbe     __wcslen_sse4_1
> -# endif
>         mov     %RSI_LP, %R10_LP
>         and     $-64, %R10_LP
>         mov     %RSI_LP, %R11_LP
> --
> 2.25.1
>

LGTM.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

Thanks.

-- 
H.J.


More information about the Libc-alpha mailing list