This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: horrible disk throughput on itanium
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Ulrich Drepper <drepper at redhat dot com>
- Cc: Linus Torvalds <torvalds at transmeta dot com>, libc-alpha at sources dot redhat dot com
- Date: Fri, 7 Dec 2001 23:46:39 +0100
- Subject: Re: horrible disk throughput on itanium
- References: <p73r8q86lpn.fsf@amdsim2.suse.de> <Pine.LNX.4.33.0112070710120.747-100000@mikeg.weiden.de> <9upmqm$7p4$1@penguin.transmeta.com> <u8k7vylu72.fsf@gromit.moeb> <m3r8q67oqd.fsf@myware.mynet>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Fri, Dec 07, 2001 at 12:35:22PM -0800, Ulrich Drepper wrote:
> Andreas Jaeger <aj@suse.de> writes:
>
> > This should be doable. We could easily implement libio/putc.c as
> > follows:
>
> That's stupid. Either you use putc_unlocked() as any sane person
> should in non-threaded code which then expands inline or you use
> __fsetlocking(FSETLOCKING_BYCALLER). Just because people cannot
> program correctly doesn't require the implementation to fix their
> problems.
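For reference, the first option in the quoted advice can be sketched as below: take the stream lock once with flockfile() and use putc_unlocked() inside the loop, so the per-character locking cost disappears. The helper name write_bytes_locked_once is made up for illustration; flockfile, funlockfile and putc_unlocked are the standard POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write LEN bytes of BUF to FP, taking the stream
   lock once for the whole buffer instead of once per character.
   Returns the number of bytes actually written.  */
size_t
write_bytes_locked_once (FILE *fp, const char *buf, size_t len)
{
  size_t i;

  assert (fp != NULL);
  flockfile (fp);               /* one lock for the whole loop */
  for (i = 0; i < len; ++i)
    if (putc_unlocked ((unsigned char) buf[i], fp) == EOF)
      break;
  funlockfile (fp);             /* one unlock */
  return i;
}
```

In purely single-threaded code the flockfile/funlockfile pair can be dropped as well, which is the case the quoted text has in mind.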
But can we really not do any better at speeding up the locking/unlocking
in the thread-safe routines?
I see the
  _IO_cleanup_region_start ((void (*) __P ((void *))) _IO_funlockfile, fp);
  _IO_flockfile (fp);
  DO_SOMETHING
  _IO_funlockfile (fp);
  _IO_cleanup_region_end (0);
sequence is very common in libio, which involves (assuming no
FSETLOCKING_BYCALLER):
  (fp->_flags & _IO_USER_LOCK) == 0 test
  _pthread_cleanup_push_defer != NULL test (reading it from .got)
  maybe _IO_funlockfile is read from .got
  maybe _pthread_cleanup_push_defer call, through .plt
  (fp->_flags & _IO_USER_LOCK) == 0 test (note that
    _pthread_cleanup_push_defer might have been called, so at least
    the gccs I've tried will do this test again)
  _IO_flockfile() call through .plt
  DO SOMETHING
  (fp->_flags & _IO_USER_LOCK) == 0 test (again, __overflow or something
    might have been called)
  _IO_funlockfile() call through .plt
  maybe call to _pthread_cleanup_pop_restore
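To make the cost concrete, here is a toy model of that sequence. It is only a sketch: struct toy_file, TOY_USER_LOCK and the counters are invented stand-ins, not libio's real definitions, and the "calls" are merely counted rather than made through .plt.

```c
#include <assert.h>

#define TOY_USER_LOCK 0x1       /* stands in for _IO_USER_LOCK */

/* Invented stand-in for FILE, with counters for the work done.  */
struct toy_file
{
  int flags;
  int lock_depth;
  int flag_tests;               /* how often the flag was re-tested */
  int calls;                    /* how many out-of-line calls were made */
};

/* Models the repeated (fp->_flags & _IO_USER_LOCK) == 0 test.  */
static int
toy_needs_lock (struct toy_file *fp)
{
  ++fp->flag_tests;
  return (fp->flags & TOY_USER_LOCK) == 0;
}

/* Models one locked stdio operation.  HAVE_PTHREAD plays the role of
   the _pthread_cleanup_push_defer != NULL test.  */
void
toy_locked_op (struct toy_file *fp, int have_pthread)
{
  int pushed = 0;

  if (toy_needs_lock (fp) && have_pthread)
    {
      ++fp->calls;              /* cleanup push */
      pushed = 1;
    }
  if (toy_needs_lock (fp))
    {
      ++fp->calls;              /* _IO_flockfile */
      ++fp->lock_depth;
    }

  /* DO SOMETHING would go here.  */

  if (toy_needs_lock (fp))
    {
      ++fp->calls;              /* _IO_funlockfile */
      --fp->lock_depth;
    }
  if (pushed)
    ++fp->calls;                /* cleanup pop */
  assert (fp->lock_depth == 0);
}
```

In this model a default stream linked with the thread library pays three flag tests and four calls per operation, while a stream with the user-lock flag set still pays the three tests but makes no calls.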
Now, if e.g. fopen set some new FILE fields (there is some padding AFAIK)
to function pointers, one doing _IO_cleanup_region_start and
_IO_flockfile together (if needed) and the other doing the remaining
two calls, this could be simplified to:
  {
    struct lockme_buffer __buf;
    fp->lockme(fp, &__buf);
    DO SOMETHING
    fp->unlockme(fp, &__buf);
  }
or
  {
    struct lockme_buffer __buf;
    if (fp->lockme) fp->lockme(fp, &__buf);
    DO SOMETHING
    if (fp->unlockme) fp->unlockme(fp, &__buf);
  }
(whichever turns out to be faster). This could even be inlined.
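A self-contained sketch of the second variant, outside libio, might look like this. Everything here (struct sketch_file, sketch_open, the contents of struct lockme_buffer) is hypothetical and only meant to show the shape of the idea: the open routine decides once whether locking is needed, and each operation then does a single indirect call or a single NULL test.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-call scratch area for the lock/unlock pair.  */
struct lockme_buffer
{
  int locked;
};

/* Hypothetical stand-in for FILE with the two proposed pointers.  */
struct sketch_file
{
  pthread_mutex_t mutex;
  void (*lockme) (struct sketch_file *, struct lockme_buffer *);
  void (*unlockme) (struct sketch_file *, struct lockme_buffer *);
  int value;
  int lock_calls;               /* for demonstration only */
};

static void
real_lock (struct sketch_file *fp, struct lockme_buffer *buf)
{
  pthread_mutex_lock (&fp->mutex);
  buf->locked = 1;
  ++fp->lock_calls;
}

static void
real_unlock (struct sketch_file *fp, struct lockme_buffer *buf)
{
  assert (buf->locked);
  buf->locked = 0;
  pthread_mutex_unlock (&fp->mutex);
}

/* Decide once, at open time, whether this stream needs locking.  */
void
sketch_open (struct sketch_file *fp, int threaded)
{
  pthread_mutex_init (&fp->mutex, NULL);
  fp->lockme = threaded ? real_lock : NULL;
  fp->unlockme = threaded ? real_unlock : NULL;
  fp->value = 0;
  fp->lock_calls = 0;
}

/* One "stdio operation": bump a counter under the optional lock.  */
int
sketch_increment (struct sketch_file *fp)
{
  struct lockme_buffer buf = { 0 };
  int result;

  if (fp->lockme)
    fp->lockme (fp, &buf);
  result = ++fp->value;         /* DO SOMETHING */
  if (fp->unlockme)
    fp->unlockme (fp, &buf);
  return result;
}
```

The sketch leaves out the cancellation cleanup handling that the real lockme would also have to fold in; that is what struct lockme_buffer would presumably carry in practice.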
Or, if this wouldn't make it any faster, couldn't _IO_USER_LOCK at least
be set by default in fopen when not linked against -lpthread?
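Until something like that exists in the library, an application can get roughly the same effect per stream with the public interface from <stdio_ext.h>. The wrapper name fopen_bycaller below is made up; __fsetlocking, FSETLOCKING_BYCALLER and FSETLOCKING_QUERY are the documented names, and note that __fsetlocking takes the stream as its first argument. This is only a sketch, and it is only safe if the caller really is single-threaded or does its own locking.

```c
#include <assert.h>
#include <stdio.h>
#include <stdio_ext.h>

/* Hypothetical wrapper: open a stream and tell stdio that the caller
   takes responsibility for locking it.  */
FILE *
fopen_bycaller (const char *path, const char *mode)
{
  FILE *fp = fopen (path, mode);

  if (fp != NULL)
    __fsetlocking (fp, FSETLOCKING_BYCALLER);
  return fp;
}

/* Report whether FP is currently in caller-locking mode.  */
int
stream_is_bycaller (FILE *fp)
{
  assert (fp != NULL);
  return __fsetlocking (fp, FSETLOCKING_QUERY) == FSETLOCKING_BYCALLER;
}
```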
Jakub