This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PATCH: Improve 32 bit strcat functions with SSE2/SSSE3
On Wed, Jul 20, 2011 at 05:33, Dmitrieva Liubov
<liubov.dmitrieva@gmail.com> wrote:
> I can't reproduce it on my 64bit Fedora 14 /Sandy Bridge as well.
There is at least one problem, though. It depends on the exact
alignment, which is why you might not see it.
The failing strncat() call has the parameters 0x08052140, 0x080501ae,
0xffffffff. This is a test case from the tester.c tests.
If you look at the code there are two strange things. The first is at
strcat-sse2.S:310:

        pcmpeqb (%esi), %xmm1
# ifdef USE_AS_STRNCAT
        add     %ecx, %ebx
# endif
        pmovmskb %xmm1, %edx
        shr     %cl, %edx
# ifdef USE_AS_STRNCAT
        cmp     $16, %ebx

Here $ebx is -1 (0xffffffff) and $ecx is 14, so the add overflows and
$ebx wraps around to 13. I'm not sure whether the following cmp test
with $16 does the right thing.
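
To make the arithmetic concrete, here is a small C sketch of the
wraparound. It assumes $ecx holds the offset of the source pointer
within its 16-byte block (which matches 0x080501ae, whose low four
bits are 14) and that $ebx holds the length argument. The helper names
are hypothetical and do not appear in glibc.

```c
#include <stdint.h>

/* Hypothetical helper: offset of an address within its 16-byte block.
   Assumption: this is the value the assembly keeps in %ecx. */
uint32_t misalignment16(uint32_t addr)
{
    return addr & 15u;
}

/* Hypothetical helper mirroring "add %ecx, %ebx" on a 32-bit register.
   Unsigned arithmetic wraps modulo 2^32, just like the register does. */
uint32_t adjusted_count(uint32_t n, uint32_t offset)
{
    return (uint32_t)(n + offset);
}
```

For the failing call, misalignment16(0x080501ae) is 14 and
adjusted_count(0xffffffff, 14) is 13. The adjusted count is therefore
below 16 even though the caller passed the largest possible length.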
The current code follows this branch and gets to line 611. The
following BRANCH_TO_JMPTBL_ENTRY invocation uses $ebx as an index.
At this point $ebx again has the value -1. This isn't correct since
this is a pointer in the ExitTable, completely unrelated.
And another problem: why do I end up in the SSE2 code in the first
place? This is a SandyBridge laptop. Model 42. This model is not
tagged as using the fast string ops etc in
sysdeps/x86_64/multiarch/init-arch.c. Is this correct? I cannot
believe that this core has problems with this code, it's an i7
processor.
Also examine the x86-64 code for the problem described above. I
haven't done that. The fact that I haven't seen any problems might
just be dumb luck.
I'll try to add some more tests to the test suite to catch this type
of problem reliably. But you have enough information to track the
problem down.