This is the mail archive of the gsl-discuss@sources.redhat.com mailing list for the GSL project.


Speed Issues

From: David Ronis <ronis at ronispc dot chem dot mcgill dot ca>
To: gsl-discuss at sources dot redhat dot com
Date: Tue, 18 Dec 2001 13:03:09 -0500
Subject: Speed Issues
Reply-to: ronis at onsager dot chem dot mcgill dot ca


I've compiled gsl-1.0 on an i686-Linux-gnu box and on a dual Athlon box,
each with its own local build of the ATLAS BLAS routines.  In the one
application we've tried, I notice about a 30% slowdown (on either box)
compared to the same code compiled with IMSL routines.  All the
libraries and code were compiled with gcc-2.95.3, and my application
had GSL_RANGE_CHECK_OFF and HAVE_INLINE defined (although leaving them
undefined does not change the speed much).

Specifically, the application has to solve for the roots of about 600
coupled nonlinear equations, and we've been using the following code:

  const gsl_multiroot_fsolver_type *T;
  gsl_multiroot_fsolver *sss;
  int status;
  size_t iii, iter = 0;
  double x_init[3*Ntarget];
  
  const size_t nnn = 3*Ntarget;
  struct rparams ppp = {1.0, 10.0};
  gsl_multiroot_function f = {&rosenbrock_f, nnn, &ppp};
  
  /* start from the origin */
  for(iii = 0; iii < nnn; iii++)
    x_init[iii] = 0.0;
  gsl_vector *x = gsl_vector_alloc (nnn);
  
  /* copy the initial guess into the gsl_vector */
  for(iii = 0; iii < nnn; iii++)
    gsl_vector_set (x, iii, x_init[iii]);
  
  T = gsl_multiroot_fsolver_hybrids;
  sss = gsl_multiroot_fsolver_alloc (T, nnn);

  start = clock();
  
  gsl_multiroot_fsolver_set (sss, &f, x);

  do
    {

      iter++;
      status = gsl_multiroot_fsolver_iterate (sss);
      if (status)   /* check if solver is stuck */
	break;
      status=gsl_multiroot_test_delta (sss->dx, sss->x, 0.0, 1.0e-6);

    }
  while (status == GSL_CONTINUE && iter < 1000);

  elapsed_time += (double)(clock()-start)/CLOCKS_PER_SEC;
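
For reference, the stopping test in the loop above can be sketched in
plain C.  This is only a stand-in that follows the condition documented
for gsl_multiroot_test_delta, |dx_i| < epsabs + epsrel*|x_i| for every
component; it is not the library source.  The names sketch_test_delta,
SKETCH_SUCCESS and SKETCH_CONTINUE are made up for illustration, and it
works on raw arrays rather than gsl_vectors.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Stand-in return codes (hypothetical names); the real GSL_SUCCESS and
   GSL_CONTINUE are defined in <gsl/gsl_errno.h>. */
enum { SKETCH_SUCCESS = 0, SKETCH_CONTINUE = -2 };

/* Hypothetical helper, not a GSL function.  Returns SKETCH_SUCCESS when
   every component satisfies |dx[i]| < epsabs + epsrel * |x[i]|,
   and SKETCH_CONTINUE as soon as one component fails. */
static int sketch_test_delta(const double *dx, const double *x, size_t n,
                             double epsabs, double epsrel)
{
  size_t i;

  /* Assumption of this sketch: tolerances are non-negative. */
  assert(epsabs >= 0.0 && epsrel >= 0.0);

  for (i = 0; i < n; i++)
    {
      double tolerance = epsabs + epsrel * fabs(x[i]);
      if (!(fabs(dx[i]) < tolerance))
        return SKETCH_CONTINUE;   /* this component has not converged */
    }
  return SKETCH_SUCCESS;
}
```

With epsabs = 0.0 and epsrel = 1.0e-6, as in my loop, the tolerance for
each component is purely relative, so under this condition a component
whose current value is exactly zero has a tolerance of zero.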

I compile with the following flags:

  -O3 -march=i686 -ffast-math -funroll-loops -fomit-frame-pointer
  -fforce-mem -fforce-addr -malign-jumps=3 -malign-loops=3
  -malign-functions=3 -mpreferred-stack-boundary=3

and link with the atlas blas routines.

I've also tried the IMSL routine ZSPOW (written in Fortran, from an
early version of the IMSL library).  As I mentioned at the outset, the
GSL version is about 30% slower, although the two give identical
roots.


Any suggestions?  I've played around with eliminating some of the
additional indirection associated with having general code for
arbitrary strides (e.g., by manipulating the data members of the
gsl_vector directly, assuming stride=1), but this only speeds things up
slightly.
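
To make the indirection concrete, here is a minimal sketch of the two
access patterns.  The struct sketch_vector is a stand-in holding only
the fields that matter here (size, stride, data); it is not the real
gsl_vector, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the relevant part of a gsl_vector: a length, a stride
   between consecutive elements, and a data pointer.  NOT the GSL type. */
typedef struct
{
  size_t size;
  size_t stride;
  double *data;
} sketch_vector;

/* General access: element i lives at data[i * stride]. */
static double sketch_sum_strided(const sketch_vector *v)
{
  double sum = 0.0;
  size_t i;
  for (i = 0; i < v->size; i++)
    sum += v->data[i * v->stride];
  return sum;
}

/* Shortcut: valid only when stride == 1, i.e. the data are contiguous. */
static double sketch_sum_contiguous(const sketch_vector *v)
{
  const double *p = v->data;
  double sum = 0.0;
  size_t i;
  assert(v->stride == 1);   /* the assumption the shortcut relies on */
  for (i = 0; i < v->size; i++)
    sum += p[i];
  return sum;
}
```

The second form saves one multiplication per access, but it silently
gives wrong answers if it is ever handed a vector with a stride other
than 1, which is why the assertion is there.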


David

P.S. It doesn't seem to be in the documentation, but is there any
convention as to what the initial stride of a gsl_vector is?  When can
I assume that it's 1 and will remain so?

Follow-Ups:
- Re: Speed Issues
  - From: Brian Gough
