This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Split mantissa calculation loop and add branch predictionto mp multiplication
- From: Andreas Jaeger <aj at suse dot com>
- To: Siddhesh Poyarekar <siddhesh at redhat dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Tue, 01 Jan 2013 14:30:07 +0100
- Subject: Re: [PATCH] Split mantissa calculation loop and add branch predictionto mp multiplication
- References: <20121231092850.GA21621@spoyarek.pnq.redhat.com>
On 12/31/2012 10:28 AM, Siddhesh Poyarekar wrote:
Hi,
The loop inside __mul in mpa.c is slow because of the branch
evaluation within the loop. Attached patch splits the loop to make
the loop simpler to evaluate and also adds some branch prediction to
improve performance. I have verified that this does not break the
testsuite.
I checked for performance improvement with this patch using the test
program here:
http://entropymine.com/imageworsener/slowpow/
with the following commandline:
time ./powtest 1000000 1.0000000000000020 1.5000000000500000
Taking into account the best times with and without the patch (with
more than 10 runs each), the improvement is about a quarter of a
second.
OK to commit?
This looks fine after fixing a minor style issue (see below).
Siddhesh
ChangeLog:
* sysdeps/ieee754/dbl-64/mpa.c (__mul): Split mantissa
calculation loop and add branch prediction.
diff --git a/sysdeps/ieee754/dbl-64/mpa.c b/sysdeps/ieee754/dbl-64/mpa.c
index 78afd99..278fdb6 100644
--- a/sysdeps/ieee754/dbl-64/mpa.c
+++ b/sysdeps/ieee754/dbl-64/mpa.c
@@ -447,33 +447,51 @@ void
SECTION
__mul(const mp_no *x, const mp_no *y, mp_no *z, int p) {
- int i, i1, i2, j, k, k2;
+ int i, j, k, k2;
double u;
- /* Is z=0? */
- if (X[0]*Y[0]==ZERO)
- { Z[0]=ZERO; return; }
-
- /* Multiply, add and carry */
- k2 = (p<3) ? p+p : p+3;
- Z[k2]=ZERO;
- for (k=k2; k>1; ) {
- if (k > p) {i1=k-p; i2=p+1; }
- else {i1=1; i2=k; }
- for (i=i1,j=i2-1; i<i2; i++,j--) Z[k] += X[i]*Y[j];
-
- u = (Z[k] + CUTTER)-CUTTER;
- if (u > Z[k]) u -= RADIX;
- Z[k] -= u;
- Z[--k] = u*RADIXI;
- }
+ /* Is z=0? */
+ if (__glibc_unlikely (X[0] * Y[0] == ZERO))
+ {
+ Z[0]=ZERO;
+ return;
+ }
- /* Is there a carry beyond the most significant digit? */
- if (Z[1] == ZERO) {
- for (i=1; i<=p; i++) Z[i]=Z[i+1];
- EZ = EX + EY - 1; }
- else
- EZ = EX + EY;
+ /* Multiply, add and carry */
+ k2 = (__glibc_unlikely (p < 3)) ? p + p : p + 3;
+ Z[k2] = ZERO;
+
+ for (k = k2; k > p; )
+ {
+ for (i = k - p, j = p; i < p + 1; i++, j--)
+ Z[k] += X[i] * Y[j];
+
+ u = (Z[k] + CUTTER)-CUTTER;
+ if (u > Z[k]) u -= RADIX;
+ Z[k] -= u;
+ Z[--k] = u * RADIXI;
compare the formatting of the above code with the one below - and let's
use consistently the better in both places:
if (u > Z[k])
u -= RADIX;
Z[k] -= u;
Z[--k] = u * RADIXI;
+ }
+
+ for (; k > 1; )
+ {
+ for (i = 1,j = k - 1; i < k; i++, j--)
+ Z[k] += X[i] * Y[j];
+
+ u = (Z[k] + CUTTER)-CUTTER;
+ if (u > Z[k])
+ u -= RADIX;
+ Z[k] -= u;
+ Z[--k] = u*RADIXI;
+ }
+
+ EZ = EX + EY;
+ /* Is there a carry beyond the most significant digit? */
+ if (__glibc_unlikely (Z[1] == ZERO))
+ {
+ for (i = 1; i <= p; i++)
+ Z[i] = Z[i+1];
+ EZ--;
+ }
Z[0] = X[0] * Y[0];
}
Thanks,
Andreas
--
Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126