This is the mail archive of the gsl-discuss@sources.redhat.com mailing list for the GSL project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
Hi, again, let me apologize for the length of this email. This email is about the generation of power exponential pseudo random variates and its implementation inside GSL. If you are not interested in this issue, you can safely ignore it:-). For the rest of you, here's a table of contents: Sec.1 Good news: the code can be rewritten to be much faster Sec.2 Bad news: the relative performances of the methods seem to depend on the platform Section 1 --------- As I wrote in a previous email, the function for the generation of power exponential random variables actually implemented inside GSL (file randist/exppow.c) can be improved by exploiting the analytic expression of the threshold levels for the rejection method, as provided in Tadikamalla's paper (see my previous email for details). Indeed, rewriting the code, I obtained a noticeable increase in speed. Moreover, when I tested them on my laptop, the three different methods, namely the transformation of a gamma variate(GS), the rejection method from a Laplace distribution (ED) and the rejection method from a Gaussian (EN), performed as described by Tadikamalla (and as implicitly assumed inside the actual GSL implementation). Here the times for the generation of 1 million variates as a function of b using the new version of the three methods(ED can only be used for b>=1 and EN for b>=2) #b GS ED EN 0.500 0.560 nan nan 0.700 1.600 nan nan 0.900 1.500 nan nan 1.001 1.340 1.140 nan 1.250 1.350 1.120 nan 1.500 1.340 1.180 nan 1.750 1.340 1.230 nan 2.001 1.310 1.200 1.100 2.250 1.300 1.240 1.130 2.500 1.290 1.270 1.180 2.750 1.270 1.300 1.200 3.000 1.270 1.150 1.080 3.500 1.260 1.380 1.280 4.000 1.240 1.330 1.260 5.000 1.220 1.390 1.320 6.000 1.200 1.430 1.370 8.000 1.170 1.530 1.430 as can be seen the optimal method as a function of b is: b<1 GS 1<= b < 2 ED 2<= b <~ 3.5 EN B >~ 3.5 GS broadly in line with Tadikamalla's original findings. Section 2 --------- The previous results suggest for a simple rewriting of the present GSL code. However, I decided to benchmark the routines also on different machines. What I found surprised me: on AMD systems, the relative performances of the tree methods seem different. Here the results of my test program on an Athlon XP #b GS ED EN 0.500 0.560 nan nan 0.700 1.370 nan nan 0.900 1.250 nan nan 1.001 1.180 1.040 nan 1.250 1.170 1.110 nan 1.500 1.170 1.140 nan 1.750 1.160 1.200 nan 2.001 1.140 1.220 1.170 2.250 1.130 1.270 1.180 2.500 1.120 1.290 1.210 2.750 1.100 1.330 1.240 3.000 1.090 1.190 1.140 3.500 1.080 1.430 1.340 4.000 1.060 1.240 1.190 5.000 1.050 1.310 1.250 6.000 1.040 1.390 1.320 8.000 1.000 1.440 1.370 and on a dual opteron #b GS ED EN 0.500 0.300 nan nan 0.700 1.010 nan nan 0.900 0.910 nan nan 1.001 0.840 0.770 nan 1.250 0.850 0.870 nan 1.500 0.830 0.910 nan 1.750 0.820 0.950 nan 2.001 0.810 0.970 0.880 2.250 0.850 1.150 1.010 2.500 0.850 1.180 1.050 2.750 0.830 1.150 0.960 3.000 0.770 1.120 0.980 3.500 0.760 1.150 1.020 4.000 0.750 1.200 1.050 5.000 0.730 1.280 1.110 6.000 0.730 1.300 1.170 8.000 0.710 1.410 1.250 In the first case method EN is always suboptimal; in the second case method GS always runs faster. Notice that these results have been obtained running EXACTLY the same source code, that you can find in attachment. The source has always been compiled with gcc -O2 test_exppow.c -lgsl -lgslcblas -o test Moreover, on all systems, the GSL were installed with the standard configure-make-make install sequence (i.e. CFLAGS = "-g -O2"). I hope that my surprise about this finding is due to my ignorance and that someone could easily explain me the reason of these differences across systems. It would be nice if someone can run the attached program on his/her machines (just compile it with the line above and issue the command "./test" without options) and let me know about the results. In any case, I'll prepare a replacement for exppow.c with the improvements mentioned in Section 1. These should lead to FASTER programs on any system. Best, Giulio.
/* Giulio Bottazzi Mon Oct 18 2004*/ #define _GNU_SOURCE #include <math.h> #include <gsl/gsl_math.h> #include <gsl/gsl_sf_gamma.h> #include <gsl/gsl_rng.h> #include <gsl/gsl_randist.h> #include <time.h> /* Rejection method using Laplace distribution */ double exppow_ED(const gsl_rng * r, const double a, const double b) { const double A = 1./b ; const double B = pow(A,A) ; double x,y ; do{ x = gsl_ran_laplace(r,B) ; y = gsl_rng_uniform (r) ; } while(log(y) > -pow(fabs(x),b)+fabs(x)/B-1.+A); return a*x ; } /* Rejection method using Gaussian distribution */ double exppow_EN(const gsl_rng * r, const double a, const double b) { const double A = 1./b ; const double B = pow(A,A) ; double x,y ; do{ x = gsl_ran_gaussian_ratio_method (r,B) ; y = gsl_rng_uniform (r) ; } while(log(y) > -pow(fabs(x),b)+x*x/(2.*B*B)+A-.5) ; return a*x ; } /* transformation of Gamma variate */ double exppow_gamma(const gsl_rng * r, const double a, const double b) { double u = gsl_rng_uniform (r) ; double v = gsl_ran_gamma (r, 1/b, 1.0) ; double z = a * pow(v, 1/b) ; if (u > 0.5) { return z ; } else { return -z ; } } int main(int argc,char* argv[]) { unsigned i,j; /* parameters */ double a=1; double b=1; unsigned int n=1000000; /* GSL RNG */ gsl_rng * rng; long seed=0; /* the seed of the RNG */ /* time */ clock_t start, end; double gamma_time_used,ED_time_used,EN_time_used; /* b values */ const double bvals[] = {.5,.7,.9, 1.001,1.25,1.5,1.75, 2.001,2.25,2.5,2.75, 3.,3.5,4.,5.,6.,8.,10 }; const unsigned bnum=17; /* COMMAND LINE PROCESSING */ /* variables for reading command line options */ /* ------------------------------------------ */ char opt; extern char *optarg; /* extern int optind, opterr, optopt; */ extern int optopt; /* ------------------------------------------ */ /* read the command line */ while((opt=getopt(argc,argv,"a:N:R:"))!=EOF){ if(opt=='?'){ fprintf(stderr,"option %c not recognized\n",optopt); return(-1); } else if(opt=='N'){ /*number of rv to generate*/ n= (unsigned) atoi(optarg); } else if(opt=='a'){ a=atof(optarg); } else if(opt=='R'){ seed= (long) atoi(optarg); } else if(opt=='h'){ /*print short help*/ printf("Compare method to generate rv\n\n"); printf("Usage: %s [options]\n",argv[0]); printf(" Options:\n"); printf("-N\tnumber of points [100]\n"); printf("-a\tset the scale [1.0]\n"); printf("-R\tset the seed [0]\n"); exit(0); } } /* initialize RNG */ rng = gsl_rng_alloc (gsl_rng_mt19937); gsl_rng_set (rng,seed); printf("#b\tGS\tED\tEN\n"); for(j=0;j<bnum;j++){ const double b = bvals[j]; /* benchmarking gamma */ start = clock(); for (i=0;i<n;i++){ exppow_gamma(rng,a,b); } end = clock(); gamma_time_used = ((double) (end - start)) / CLOCKS_PER_SEC; if(b>=1) { /* benchmarking ED ; valid for b>=1*/ start = clock(); for (i=0;i<n;i++){ exppow_ED(rng,a,b); } end = clock(); ED_time_used = ((double) (end - start)) / CLOCKS_PER_SEC; } else ED_time_used=NAN; if(b>=2){/* benchmarking EN ; valid for b>=2 */ start = clock(); for (i=0;i<n;i++){ exppow_EN(rng,a,b); } end = clock(); EN_time_used = ((double) (end - start)) / CLOCKS_PER_SEC; } else EN_time_used = NAN; /* print results */ printf("%.3f\t%.3f\t%.3f\t%.3f\n",b, gamma_time_used,ED_time_used,EN_time_used); } }
Attachment:
pgp00000.pgp
Description: PGP signature
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |