This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Proposal for CPU dispatching in libc
- From: Agner Fog <agner at agner dot org>
- To: libc-help <libc-help at sourceware dot org>
- Cc: lucaregini at yahoo dot it
- Date: Thu, 02 Jul 2009 10:19:18 +0200
- Subject: Proposal for CPU dispatching in libc
- References: <4892AB88.9040905@agner.org> <1217955154.7784.21.camel@localhost> <48995AF3.6060105@agner.org> <1218049581.7809.60.camel@localhost> <489A8F1F.7050702@agner.org> <489ACA0D.2090705@agner.org> <119aab440808070431t1a935240i948a1206b720bfe@mail.gmail.com> <489AE2B2.9010703@agner.org> <119aab440808070533q1897acc9kb0223b09d64e1922@mail.gmail.com> <489BE528.7090807@agner.org> <119aab440808080522ne73089al9a0d36b3befe24bd@mail.gmail.com> <489D537F.3010005@agner.org>
Last year I proposed to improve the string functions and various other
functions in libc (see the thread "Why do you want libc to be 5 times
slower than other libraries?", August 2008). However, due to lack of
volunteers this was never implemented. Now Luca Regini has volunteered
to help with this work so I am taking up the issue again. We are
considering improving string and memory functions, and perhaps math
functions, in libc for x86 and x86-64.
Before we start to make any patches, we have to agree on a general
framework for CPU dispatching as explained below. This is going to be a
long discussion because there are a lot of details and issues that have
to be discussed and agreed upon. Please allow this discussion to proceed
on the libc-alpha mailing list. libc-help is too noisy an environment
and intended mainly for libc users, not for developers.
Explanation of CPU-dispatching: You make two or more versions of a
particular function for different instruction sets. For example, you
make two different versions of the sin function in x86 libc, one for the
old x87 instruction set and one that uses the newer SSE2 instruction
set. Both are included in the library. At runtime, the library checks
whether the CPU and the operating system support SSE2. If so, it runs
the more efficient SSE2 version of sin; if not, it runs the x87 version,
which is compatible with old CPUs.
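
To make the mechanism concrete, here is a minimal sketch in C of one way
such a dispatcher could look. It is only an illustration, not a proposed
implementation. All demo_* names are hypothetical, a string-length
function is used instead of sin to keep the sketch short, and the "SSE2"
version is a plain C stand-in. The CPU check assumes the <cpuid.h>
helper shipped with GCC and only looks at the CPU feature bit; it does
not check operating system support, and the lazy pointer update is not
made thread-safe.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */
#define DEMO_HAVE_CPUID 1
#endif

/* All demo_* names are hypothetical; none of them exist in glibc. */

/* Baseline version: must run on every supported CPU. */
static size_t demo_strlen_x86(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Stand-in for an SSE2 version. A real one would use SSE2 instructions;
   the body here is plain C so that the sketch compiles everywhere. */
static size_t demo_strlen_sse2(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Returns nonzero if the CPU reports SSE2 (CPUID leaf 1, EDX bit 26).
   Operating system support is NOT checked in this sketch. */
static int demo_cpu_has_sse2(void)
{
#ifdef DEMO_HAVE_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((edx >> 26) & 1u);
#else
    return 0;   /* not x86, or no <cpuid.h>: use the baseline version */
#endif
}

typedef size_t (*demo_strlen_fn)(const char *);

static size_t demo_strlen_resolve(const char *s);

/* Starts out pointing at the resolver, which replaces it on first use. */
static demo_strlen_fn demo_strlen_ptr = demo_strlen_resolve;

/* First call lands here: pick a version once, then forward the call. */
static size_t demo_strlen_resolve(const char *s)
{
    demo_strlen_ptr = demo_cpu_has_sse2() ? demo_strlen_sse2
                                          : demo_strlen_x86;
    return demo_strlen_ptr(s);
}

/* Dispatch entry: the only name ordinary callers would use. */
static size_t demo_strlen(const char *s)
{
    assert(s != NULL);   /* debug-only check for the sketch */
    return demo_strlen_ptr(s);
}
```

With this layout the CPU check runs only on the first call; every later
call costs one indirect jump through demo_strlen_ptr.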
There appears to be very little CPU-dispatching in libc. The only thing
I have encountered is the memcpy function using different branches
depending on cache sizes. Are there any guidelines for how to do
CPU-dispatching? If not, we will have to develop a framework and a set
of guidelines for CPU-dispatching.
The CPU-dispatching framework might ideally have the following features:
* Be as portable as possible, although some aspects are obviously
specific to x86 and x86-64 platforms
* Be applicable to any function in libc
* Allow easy extension to future instruction sets
* Be testable. It should be possible to test all versions of a function
on a single computer, provided, of course, that the computer supports
all the necessary instruction sets (or is capable of emulating them).
* Possibility of bypassing the CPU-dispatching. If a program, or part of
it, is compiled for a specific CPU or instruction set then the compiler
might insert calls directly to the specific version of the function
rather than to the dispatch entry. The other versions of the function
need not be linked in on static linking. This needs a naming convention
for instruction-set-specific function versions. The compiler needs
information on which function versions are available in the specific
version of libc used.
* Possibility of building lean versions of libc without support for
certain instruction sets.
* Possibility of building .so versions of libc supporting only a
specific CPU.
* Possibility of integration with GNU compilers. I don't think gcc has a
feature for CPU-dispatching of user code, but it might have such a
feature added in the future and glibc should be compatible with such a
future gcc feature.
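
Some of the features listed above (a naming convention, testability,
bypassing the dispatcher, and lean builds) can be sketched together in
C. This is only an illustration under stated assumptions: the
<function>_<instruction set> naming scheme, the DEMO_NO_SSE2 build
switch and all demo_* names are hypothetical, and the "SSE2" version is
a plain C stand-in. The only real compiler feature relied on is the
__SSE2__ macro, which gcc predefines when SSE2 code generation is
enabled (for example with -msse2).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*demo_strlen_fn)(const char *);

/* Hypothetical naming convention: <function>_<instruction set>. */
static size_t demo_strlen_x86(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

#ifndef DEMO_NO_SSE2   /* hypothetical switch for a lean build */
/* Plain C stand-in for a real SSE2 implementation. */
static size_t demo_strlen_sse2(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
#endif

/* Table of the versions built into this (hypothetical) library. */
struct demo_version {
    const char *isa;
    demo_strlen_fn fn;
};

static const struct demo_version demo_strlen_versions[] = {
    { "x86",  demo_strlen_x86  },
#ifndef DEMO_NO_SSE2
    { "sse2", demo_strlen_sse2 },
#endif
};

#define DEMO_NVERSIONS \
    (sizeof demo_strlen_versions / sizeof demo_strlen_versions[0])

/* Look a version up by instruction set name; NULL if not built in. */
static demo_strlen_fn demo_strlen_lookup(const char *isa)
{
    size_t i;
    for (i = 0; i < DEMO_NVERSIONS; i++)
        if (strcmp(demo_strlen_versions[i].isa, isa) == 0)
            return demo_strlen_versions[i].fn;
    return NULL;
}

/* Bypass: code compiled for SSE2 calls the SSE2 version directly. */
#if defined(__SSE2__) && !defined(DEMO_NO_SSE2)
#define demo_strlen_direct demo_strlen_sse2
#else
#define demo_strlen_direct demo_strlen_x86
#endif

/* Testability: run every built-in version against the C library's
   strlen and return the number of checks performed. */
static int demo_strlen_test_all(void)
{
    static const char *const inputs[] = { "", "a", "hello, world" };
    int checked = 0;
    size_t v, i;

    for (v = 0; v < DEMO_NVERSIONS; v++) {
        for (i = 0; i < sizeof inputs / sizeof inputs[0]; i++) {
            assert(demo_strlen_versions[v].fn(inputs[i])
                   == strlen(inputs[i]));
            checked++;
        }
    }
    return checked;
}
```

A test program would call demo_strlen_test_all() on a machine that
supports every listed instruction set, while code built for a known
target would use demo_strlen_direct and never touch the table.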
It may not be feasible to include all these features, but this has to be
discussed and we may prepare the framework for adding such features in
the future.
Again, I propose that we move this discussion to the libc-alpha list.