This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Rewritten v9/64-bit sparc strcmp.


On Wed, Aug 24, 2011 at 01:38:54AM -0700, David Miller wrote:
> 
> This new code is heavily inspired by the powerpc 64-bit base strcmp.
> 
> It's faster than the existing code, especially on Niagara cpus as
> the number of branches has been minimized to reduce cpu thread
> switching.
> 
> On UltraSPARC-3 the tail code executes in a constant 5 cycles,
> regardless of where the mismatching/zero byte is.  The main aligned
> loop executes in 4 cycles, which with a 2 cycle load latency is
> essentially optimal.
> 
> Committed to master.
> 
> ---
>  ChangeLog                      |    4 +
>  sysdeps/sparc/sparc64/strcmp.S |  416 ++++++++++++++++------------------------
>  2 files changed, 173 insertions(+), 247 deletions(-)

[snip]

> +.Lcommon_equal:
> +	retl
> +	 mov	0, %o0
> +
> +	/* All loops terminate here once they find an unequal word.
> +	 * If a zero byte appears in the word before the first unequal
> +	 * byte, we must report zero.  Otherwise we report '1' or '-1'
> +	 * depending upon whether the first mis-matching byte is larger
> +	 * in the first string or the second, respectively.
> +	 *
> +	 * First we compute a 64-bit mask value that has "0x01" in
> +	 * each byte where a zero exists in rWORD1.  rSTRXOR holds the
> +	 * value (rWORD1 ^ rWORD2).  Therefore, if considered as an
> +	 * unsigned quantity, our "0x01" mask value is "greater than"
> +	 * rSTRXOR then a zero terminating byte comes first and
> +	 * therefore we report '0'.
> +	 *
> +	 * The formula for this mask is:
> +	 *
> +	 *    mask_tmp1 = ~rWORD1 & 0x8080808080808080;
> +	 *    mask_tmp2 = ((rWORD1 & 0x7f7f7f7f7f7f7f7f) +
> +	 *                 0x7f7f7f7f7f7f7f7f);
> +	 *
> +	 *    mask = ((mask_tmp1 & ~mask_tmp2) >> 7);

This method doesn't work when string 1 has a 0x00 byte where string 2
has a 0x01 byte. In that case the mask for this byte is 0x01 and the
corresponding xor byte is also 0x01, so the unsigned comparison is
decided by the lower-order bytes. The result therefore depends on the
garbage after the end of the string.

On Debian [1] this causes, for example, debian-installer to fail to build
[2], and it might be the source of the random segfaults which we have
been trying to debug for a few years.

[1] http://bugs.debian.org/746310
[2] http://bugs.debian.org/731806

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

