This is the mail archive of the gsl-discuss@sources.redhat.com mailing list for the GSL project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

random variate from power exponential distribution: continue


Hi,
again, let me apologize for the length of this email. This email is
about the generation of power exponential pseudo random variates and its
implementation inside GSL. If you are not interested in this issue, you
can safely ignore it:-). For the rest of you, here's a table of
contents:

Sec.1 Good news: the code can be rewritten to be much faster

Sec.2 Bad news: the relative performances of the methods seem to depend
on the platform

Section 1
---------

As I wrote in a previous email, the function for the generation of power
exponential random variables actually implemented inside GSL (file
randist/exppow.c) can be improved by exploiting the analytic expression
of the threshold levels for the rejection method, as provided in
Tadikamalla's paper (see my previous email for details). Indeed,
rewriting the code, I obtained a noticeable increase in speed. Moreover,
when I tested them on my laptop, the three different methods, namely the
transformation of a gamma variate(GS), the rejection method from a
Laplace distribution (ED) and the rejection method from a Gaussian (EN),
performed as described by Tadikamalla (and as implicitly assumed inside
the actual GSL implementation). Here the times for the generation of 1
million variates as a function of b using the new version of the three
methods(ED can only be used for b>=1 and EN for b>=2)

#b      GS      ED      EN
0.500   0.560   nan     nan
0.700   1.600   nan     nan
0.900   1.500   nan     nan
1.001   1.340   1.140   nan
1.250   1.350   1.120   nan
1.500   1.340   1.180   nan
1.750   1.340   1.230   nan
2.001   1.310   1.200   1.100
2.250   1.300   1.240   1.130
2.500   1.290   1.270   1.180
2.750   1.270   1.300   1.200
3.000   1.270   1.150   1.080
3.500   1.260   1.380   1.280
4.000   1.240   1.330   1.260
5.000   1.220   1.390   1.320
6.000   1.200   1.430   1.370
8.000   1.170   1.530   1.430


as can be seen the optimal method as a function of b is:

    b<1       GS
1<= b < 2     ED
2<= b <~ 3.5  EN
    B >~ 3.5  GS

broadly in line with Tadikamalla's original findings.

Section 2
---------

The previous results suggest for a simple rewriting of the present GSL
code. However, I decided to benchmark the routines also on different
machines. What I found surprised me: on AMD systems, the relative
performances of the tree methods seem different. Here the results of my
test program on an Athlon XP

#b      GS      ED      EN
0.500   0.560   nan     nan
0.700   1.370   nan     nan
0.900   1.250   nan     nan
1.001   1.180   1.040   nan
1.250   1.170   1.110   nan
1.500   1.170   1.140   nan
1.750   1.160   1.200   nan
2.001   1.140   1.220   1.170
2.250   1.130   1.270   1.180
2.500   1.120   1.290   1.210
2.750   1.100   1.330   1.240
3.000   1.090   1.190   1.140
3.500   1.080   1.430   1.340
4.000   1.060   1.240   1.190
5.000   1.050   1.310   1.250
6.000   1.040   1.390   1.320
8.000   1.000   1.440   1.370


and on a dual opteron

#b      GS      ED      EN
0.500   0.300   nan     nan
0.700   1.010   nan     nan
0.900   0.910   nan     nan
1.001   0.840   0.770   nan
1.250   0.850   0.870   nan
1.500   0.830   0.910   nan
1.750   0.820   0.950   nan
2.001   0.810   0.970   0.880
2.250   0.850   1.150   1.010
2.500   0.850   1.180   1.050
2.750   0.830   1.150   0.960
3.000   0.770   1.120   0.980
3.500   0.760   1.150   1.020
4.000   0.750   1.200   1.050
5.000   0.730   1.280   1.110
6.000   0.730   1.300   1.170
8.000   0.710   1.410   1.250

In the first case method EN is always suboptimal; in the second case
method GS always runs faster.

Notice that these results have been obtained running EXACTLY the same
source code, that you can find in attachment. The source has always been
compiled with

gcc -O2 test_exppow.c -lgsl -lgslcblas -o test

Moreover, on all systems, the GSL were installed with the standard
configure-make-make install sequence (i.e. CFLAGS = "-g -O2").


I hope that my surprise about this finding is due to my ignorance and
that someone could easily explain me the reason of these differences
across systems. It would be nice if someone can run the attached program
on his/her machines (just compile it with the line above and issue the
command "./test" without options) and let me know about the results. In
any case, I'll prepare a replacement for exppow.c with the
improvements mentioned in Section 1. These should lead to FASTER
programs on any system.

Best,
	Giulio.

/* Giulio Bottazzi  Mon Oct 18 2004*/

#define _GNU_SOURCE

#include <math.h>
#include <gsl/gsl_math.h>
#include <gsl/gsl_sf_gamma.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <time.h>

/* Rejection method using Laplace distribution */
double
exppow_ED(const gsl_rng * r, const double a, const double b)
{
  
  const double A =  1./b ;
  const double B = pow(A,A) ;
  double x,y ;
  
  do{
    x = gsl_ran_laplace(r,B) ;
    y = gsl_rng_uniform (r) ;	
  }
  while(log(y) > -pow(fabs(x),b)+fabs(x)/B-1.+A);
  
  return a*x ;
  
}

/* Rejection method using Gaussian distribution */
double
exppow_EN(const gsl_rng * r, const double a, const double b)
{

  const double A =  1./b ;
  const double B = pow(A,A) ;
  double x,y ;
  
  do{
    x = gsl_ran_gaussian_ratio_method (r,B) ;
    y = gsl_rng_uniform (r) ;
  }
  while(log(y) > -pow(fabs(x),b)+x*x/(2.*B*B)+A-.5) ;
  
  return a*x ;
  
}

/* transformation of Gamma variate */
double
exppow_gamma(const gsl_rng * r, const double a, const double b)
{
  
  
  double u = gsl_rng_uniform (r) ;
  double v = gsl_ran_gamma (r, 1/b, 1.0) ;
  double z = a * pow(v, 1/b) ;
  
  if (u > 0.5) 
    {
      return z ;
    } 
  else 
    {
      return -z ;
    }
  
}


int
main(int argc,char* argv[])
{

  unsigned i,j;
  
  /* parameters */
  double a=1;
  double b=1;
  unsigned int n=1000000;

  /* GSL RNG */
  gsl_rng * rng;
  long seed=0; /* the seed of the RNG */

  /* time */
  clock_t start, end;
  double gamma_time_used,ED_time_used,EN_time_used;
 
  /* b values */
  const double bvals[]
    = {.5,.7,.9,
       1.001,1.25,1.5,1.75,
       2.001,2.25,2.5,2.75,
       3.,3.5,4.,5.,6.,8.,10
    };
  const unsigned bnum=17;
  
  
  /* COMMAND LINE PROCESSING */
  
  /* variables for reading command line options */
  /* ------------------------------------------ */
  char opt;
  extern char *optarg;
/*   extern int optind, opterr, optopt; */
  extern int optopt;
  /* ------------------------------------------ */
    
    
  /* read the command line */    
  while((opt=getopt(argc,argv,"a:N:R:"))!=EOF){
	if(opt=='?'){
	    fprintf(stderr,"option %c not recognized\n",optopt);
	    return(-1);
	}
	else if(opt=='N'){
	  /*number of rv to generate*/
	  n= (unsigned) atoi(optarg);
	}
	else if(opt=='a'){
	  a=atof(optarg);
	}
	else if(opt=='R'){
	  seed= (long) atoi(optarg);
	}
	else if(opt=='h'){
	    /*print short help*/
	  printf("Compare method to generate rv\n\n");
	  printf("Usage: %s [options]\n",argv[0]);
	  printf(" Options:\n");
	  printf("-N\tnumber of points [100]\n");
	  printf("-a\tset the scale [1.0]\n");
	  printf("-R\tset the seed [0]\n");
	  exit(0);
	}
    }

  /* initialize RNG */
  rng = gsl_rng_alloc (gsl_rng_mt19937);
  gsl_rng_set (rng,seed);

  printf("#b\tGS\tED\tEN\n");

  for(j=0;j<bnum;j++){

    const double b = bvals[j];
    
    /* benchmarking gamma */
    start = clock();
    for (i=0;i<n;i++){
      exppow_gamma(rng,a,b);
    }
    end = clock();
    gamma_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
    
    if(b>=1) { /* benchmarking ED ; valid for b>=1*/
      start = clock();
      for (i=0;i<n;i++){
	exppow_ED(rng,a,b);
      }
      end = clock();
      ED_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
    }
    else ED_time_used=NAN;


    if(b>=2){/* benchmarking EN ; valid for b>=2 */
      start = clock();
      for (i=0;i<n;i++){
	exppow_EN(rng,a,b);
      }
      end = clock();
      EN_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
    }
    else EN_time_used = NAN;


    /* print results */
    printf("%.3f\t%.3f\t%.3f\t%.3f\n",b,
	   gamma_time_used,ED_time_used,EN_time_used);
  }
  

}

Attachment: pgp00000.pgp
Description: PGP signature


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]