This is the mail archive of the gsl-discuss@sources.redhat.com mailing list for the GSL project.



Re: multivariate gaussian distribution (Code)


Hi Emmanuel,

Ok. First thing - I hardly believe that it was written somewhere that
the eigendecomposition is more stable than the Cholesky one. The
Cholesky decomposition is so stable that it can be used to determine
whether a matrix is positive definite or not (see "Numerical Recipes in
..."). The only problem may arise when the eigenvalues of the matrix
are very close to zero. NAG uses a special matrix balancing
procedure; this is what is missing in GSL.

I personally use the multivariate normal distribution for generating a
multivariate GARCH model. In my case the multivariate time series Xt has
a (conditionally) normal distribution. This means that the covariance
matrix changes all the time. Generating a time series of length 60, I have
to compute 60 different covariance matrices. Since I use Monte Carlo, I
repeat this experiment many times (several million). In one run I have to
compute about a billion normally distributed random vectors, and of
course decompose about a billion covariance matrices as well. Now,
imagine I had to use the inefficient eigendecomposition code from GSL.
He, he, he... Hundreds of years. With the quick Cholesky code it is just
a couple of hours.

Now "my" algorithm:

1. Suppose that one uses the nacr() function in order to generate the
(independent) random vector. Due to a property of the multivariate
normal distribution (if all marginal distributions are independent normal,
then the joint distribution is multivariate normal too), one just generates
the vector V as

  for(i=0; i<DIMENSION; i++) V[i]=nacr();

Now the vector has a multivariate normal distribution with covariance
matrix equal to the identity matrix.
2. Decompose the desired covariance matrix Cov using the Cholesky
decomposition, obtaining the lower triangular matrix L, so that Cov=LL'.
3. The new vector P = LV is then normally distributed with mean 0 and
covariance matrix Cov, formally P~N(0,Cov).

This algorithm can be applied only to the normal distribution, since it
relies on the property above. That property holds only for the
multivariate normal distribution; for other distributions another
algorithm must be used, but I do not know if you are interested in that.
Below is a piece of code which should help you understand the three steps.

Cheers,

Przem

      int rl = 0;               /* must be initialised before the loop */

      do
	{
	  rl++;

	  /* Cross products of the previous innovations.  */
	  for (i = 0; i < PERMUTATION; i++)
	    gsl_vector_set (LAST_NU, i,
			    gsl_vector_get (LAST_E, a[i])
			    * gsl_vector_get (LAST_E, b[i]));

	  /* GARCH recursion for the (half-vectorised) conditional
	     covariance: h = omega + diag(A)*nu + diag(B)*h_prev.  */
	  for (size_t ii = 0; ii < PERMUTATION; ii++)
	    {
	      gsl_vector_set (vector_temp_1, ii,
			      gsl_matrix_get (A, ii, ii)
			      * gsl_vector_get (LAST_NU, ii));
	      gsl_vector_set (vector_temp_2, ii,
			      gsl_matrix_get (B, ii, ii)
			      * gsl_vector_get (LAST_h, ii));
	      gsl_vector_set (h, ii,
			      gsl_vector_get (omega, ii)
			      + gsl_vector_get (vector_temp_1, ii)
			      + gsl_vector_get (vector_temp_2, ii));
	    }

	  /* Unpack h into the lower triangle of the covariance matrix
	     H ...  */
	  index = 1;
	  for (size_t ii = 1; ii <= size; ii++)
	    for (size_t jj = ii; jj <= size; jj++, index++)
	      gsl_matrix_set (H, jj - 1, ii - 1,
			      gsl_vector_get (h, index - 1));

	  /* ... and mirror it into the upper triangle.  */
	  for (size_t ii = 0; ii < size; ii++)
	    for (size_t jj = ii; jj < size; jj++)
	      gsl_matrix_set (H, ii, jj, gsl_matrix_get (H, jj, ii));

	  /* Step 2: Cholesky decomposition H = LL'; L overwrites the
	     lower triangle of H.  */
	  ifail = gsl_linalg_cholesky_decomp (H);
	  assert (ifail == 0);

	  /* Step 1: a vector of independent N(0,1) draws.  */
	  for (i = 0; i < DIMENSION; i++)
	    gsl_vector_set (XI, i, nacr ());

	  gsl_vector_memcpy (E, XI);

	  /* Step 3: E := L*E, so E ~ N(0, H).  */
	  gsl_blas_dtrmv (CblasLower, CblasNoTrans, CblasNonUnit, H, E);
	  gsl_vector_memcpy (OBSERVED_E, E);

	  /* Exponentially weighted cross products of the observed
	     innovations.  */
	  for (i = 0; i < PERMUTATION; i++)
	    gsl_vector_set (z, i,
			    (1.0 - lambda) * gsl_vector_get (z, i)
			    + lambda * (gsl_vector_get (OBSERVED_E, a[i])
					* gsl_vector_get (OBSERVED_E, b[i])));

	  for (i = 0; i < PERMUTATION; i++)
	    gsl_vector_set (CENTRED, i,
			    gsl_vector_get (z, i)
			    - gsl_vector_get (MU_TAU, i));

	  /* Mahalanobis distance of the centred vector.  */
	  if (rl <= MAXLENGTH)
	    gsl_blas_dgemv (CblasNoTrans, 1.0, SIGMA_INVERSE[rl - 1],
			    CENTRED, 0.0, vector_temp_1);
	  else
	    gsl_blas_dgemv (CblasNoTrans, 1.0, SIGMA_INVERSE[MAXLENGTH - 1],
			    CENTRED, 0.0, vector_temp_1);

	  gsl_blas_ddot (CENTRED, vector_temp_1, &mahalanobis);

	  if (mahalanobis < 0)
	    printf ("%f\n", mahalanobis);

	  /* Shift the state for the next iteration.  */
	  gsl_vector_memcpy (LAST_h, h);
	  gsl_vector_memcpy (LAST_E, E);
	}
      while (rl <= 999999);

PS. It simulates one million time series with conditionally normally
distributed vectors Et.

Sorry for the mess in the code, but I have just used a part of one function.


