This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: Inefficient ia64 system call implementation in glibc
- From: John Worley <jworley at fc dot hp dot com>
- To: GNU C Library <libc-alpha at sources dot redhat dot com>,linux ia64 kernel <linux-ia64 at vger dot kernel dot org>
- Date: Fri, 19 Sep 2003 15:46:24 -0600
- Subject: Re: Inefficient ia64 system call implementation in glibc
- Organization: Hewlett Packard Labs, Ft. Collins
- References: <20030919163218.GA21480@lucon.org>
- Reply-to: john dot worley at hp dot com
H.J. Lu <hjl@lucon.org> wrote:
The inline ia64 system call assumes all values passed to kernel are
signed 64bit. It does sign extension if the incoming arg is not signed
64bit. In case of fxstat.c:
int
__fxstat (int vers, int fd, struct stat *buf)
{
  return INLINE_SYSCALL (fstat, 2, fd, CHECK_1 (buf));
}
it leads to
0000000000000000 <__fxstat>:
0: 00 20 39 0c 80 05 [MII] alloc r36=ar.pfs,14,6,0
6: f0 e0 01 12 48 a0 mov r15=1212
c: 04 08 00 84 mov r37=r1
10: 01 38 01 44 00 21 [MII] mov r39=r34
16: 60 02 84 2c 00 60 sxt4 r38=r33
^^^^^^^^^^^^^
1c: 04 00 c4 00 mov r35=b0;;
20: 0a 00 00 00 00 02 [MMI] break.m 0x100000;;
26: 10 02 20 00 42 e0 mov r33=r8
The real inefficiency here is the compiler output. Given the
realities of the Itanium 2 implementation, the first two bundles
will require 3 cycles to execute. A better coding would be:
{ .mmi
alloc r36=ar.pfs,14,6,0
mov r15=1212
mov r35=b0
}
{ .mmi
mov r37=r1
mov r39=r34
sxt4 r38=r33
} ;;
which will execute in one cycle. The sign extension, although
"unnecessary," doesn't cost any cycles. Admittedly, you could use the
mi;;i bundle to pack the break instruction into the second bundle if
you didn't have to sign-extend, but I'd rather see the 3 vs. 1 cycle
problem addressed first.
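For readers less familiar with the instruction under discussion, here is
a minimal C sketch of what "sxt4 r38=r33" computes: the low 32 bits of
the source register, sign-extended to 64 bits. The names sxt4_model and
sxt4_self_check are hypothetical and used only for illustration; they
are not part of glibc or the kernel. The sketch assumes that the upper
32 bits of a register holding an int argument are not guaranteed to be
meaningful, which is why the extension is emitted before the value is
handed to the kernel as a 64-bit quantity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the ia64 "sxt4" instruction: keep the low
   32 bits of a 64-bit register value and sign-extend them to 64 bits.
   Written with xor/subtract so it avoids implementation-defined
   narrowing conversions. */
static int64_t sxt4_model(uint64_t reg)
{
    uint64_t low = reg & 0xffffffffULL;          /* discard upper 32 bits */
    return (int64_t)(low ^ 0x80000000ULL)        /* flip the 32-bit sign bit */
           - (int64_t)0x80000000LL;              /* and subtract it back out */
}

/* Illustrative self-check: an int fd of -1 whose upper register bits
   contain leftover garbage must still reach the kernel as 64-bit -1. */
void sxt4_self_check(void)
{
    assert(sxt4_model(0xdeadbeefffffffffULL) == -1);
    assert(sxt4_model(0x0000000100000003ULL) == 3);
}
```

Calling sxt4_self_check() from a test program exercises both cases: a
negative descriptor and a small positive one with stray upper bits.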
Regards,
John "I worry about this stuff way too much" Worley
john.worley@hp.com