This is the mail archive of the
gsl-discuss@sources.redhat.com
mailing list for the GSL project.
Speed Issues
- From: David Ronis <ronis at ronispc dot chem dot mcgill dot ca>
- To: gsl-discuss at sources dot redhat dot com
- Date: Tue, 18 Dec 2001 13:03:09 -0500
- Subject: Speed Issues
- Reply-to: ronis at onsager dot chem dot mcgill dot ca
I've compiled gsl-1.0 on an i686-Linux-gnu box and on a dual Athlon box,
each with its own local build of the ATLAS BLAS routines. In the one
application we've tried, I notice about a 30% slowdown (on either box)
compared to the same code compiled with IMSL routines. All the
libraries and code were compiled with gcc-2.95.3, and my application
had GSL_RANGE_CHECK_OFF and HAVE_INLINE defined (leaving them
undefined does not change the speed much).
Specifically, the application has to solve for the roots of about 600
coupled nonlinear equations, and we've been using the following code:
    const gsl_multiroot_fsolver_type *T;
    gsl_multiroot_fsolver *sss;
    int status;
    size_t iii, iter = 0;
    double x_init[3*Ntarget];
    const size_t nnn = 3*Ntarget;
    struct rparams ppp = {1.0, 10.0};
    gsl_multiroot_function f = {&rosenbrock_f, nnn, &ppp};

    /* initial guess: the zero vector */
    for (iii = 0; iii < nnn; iii++)
        x_init[iii] = 0.0;
    gsl_vector *x = gsl_vector_alloc (nnn);
    for (iii = 0; iii < nnn; iii++)
        gsl_vector_set (x, iii, x_init[iii]);

    T = gsl_multiroot_fsolver_hybrids;
    sss = gsl_multiroot_fsolver_alloc (T, nnn);

    /* only the solver setup and iteration are timed */
    start = clock();
    gsl_multiroot_fsolver_set (sss, &f, x);
    do
      {
        iter++;
        status = gsl_multiroot_fsolver_iterate (sss);
        if (status)   /* check if solver is stuck */
          break;
        /* converged when every step component is below 1.0e-6 (absolute) */
        status = gsl_multiroot_test_delta (sss->dx, sss->x, 0.0, 1.0e-6);
      }
    while (status == GSL_CONTINUE && iter < 1000);
    elapsed_time += (double)(clock()-start)/CLOCKS_PER_SEC;
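The timing pattern used above can be isolated as a small helper. This is only a sketch: cpu_seconds and busy_work are hypothetical names that do not come from GSL or from the application, and busy_work merely stands in for the solver loop. Note that clock() reports processor time used by the program, not wall-clock time.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: run fn(arg) once and return the CPU seconds
   it consumed, measured with clock() as in the snippet above. */
static double cpu_seconds(void (*fn)(void *), void *arg)
{
    clock_t start;

    assert(fn != NULL);
    start = clock();
    fn(arg);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Hypothetical stand-in for the solver loop: some floating-point
   busy work that accumulates into the double pointed to by arg. */
static void busy_work(void *arg)
{
    volatile double *acc = arg;
    long i;

    for (i = 0; i < 1000000L; i++)
        *acc += 1.0e-6 * (double)i;
}
```

Calling cpu_seconds(busy_work, &acc) repeatedly and adding up the results gives the same kind of accumulated figure as elapsed_time, and timing the GSL and IMSL versions through the same helper keeps the comparison like for like.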
I compile with the following flags:
-O3 -march=i686 -ffast-math -funroll-loops -fomit-frame-pointer
-fforce-mem -fforce-addr -malign-jumps=3 -malign-loops=3
-malign-functions=3 -mpreferred-stack-boundary=3
and link with the atlas blas routines.
I've also tried the IMSL routine ZSPOW (written in Fortran, from an
early version of the IMSL library). As I mentioned at the outset, the
GSL version is about 30% slower, although the two give identical
roots.
Any suggestions? I've experimented with eliminating some of the
additional indirection that comes from having general code for
arbitrary strides (e.g., by manipulating the data members of the
gsl_vector directly, assuming stride=1), but this only speeds things up
slightly.
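The stride shortcut described above can be sketched as follows. To keep the example free of library dependencies, strided_vec is a hypothetical stand-in that mirrors only the size, stride and data members of gsl_vector; vec_get and vec_norm2_sq are likewise illustrative names, not GSL functions. The idea is to test the stride at run time and take the direct-indexing path only when it is 1, instead of assuming it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the size/stride/data members of gsl_vector. */
typedef struct {
    size_t size;    /* number of elements */
    size_t stride;  /* distance between consecutive elements in data */
    double *data;   /* underlying storage */
} strided_vec;

/* General access: element i lives at data[i * stride]. */
static double vec_get(const strided_vec *v, size_t i)
{
    assert(i < v->size);
    return v->data[i * v->stride];
}

/* Sum of squares, with a fast path for contiguous storage. */
static double vec_norm2_sq(const strided_vec *v)
{
    double sum = 0.0;
    size_t i;

    if (v->stride == 1) {
        /* contiguous case: plain array indexing, no multiply per access */
        const double *p = v->data;
        for (i = 0; i < v->size; i++)
            sum += p[i] * p[i];
    } else {
        /* general case: go through the strided accessor */
        for (i = 0; i < v->size; i++) {
            double e = vec_get(v, i);
            sum += e * e;
        }
    }
    return sum;
}
```

Both paths return the same value, so the check costs one comparison per call and keeps the code correct if a vector with a different stride is ever passed in.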
David
P.S. It doesn't seem to be in the documentation, but is there any
convention as to what the initial stride of a gsl_vector is? When can
I assume that it's 1 and will remain so?