This is the mail archive of the gsl-discuss@sourceware.cygnus.com mailing list for the GSL project.



Re: multidimensional optimization


Brian Gough wrote:
> 
> The starting point is to list and classify all the existing
> techniques, and determine an appropriate API for each class of
> problem.
> 
> (There is a nice little summary of minimisation techniques in the book
> "Numerical Computation (Volume 2)" by Ueberhuber.)

Ok.

My main goal is to implement gradient-based descent techniques. These
methods follow a very simple algorithm:

- initialise: take an initial guess for the minimum
- loop:
	- compute the gradient of the function at the current guess
	- if available, compute a new descent direction from the previous one
and the gradient; otherwise, use the gradient as the descent direction
	- call a one-dimensional minimizer to optimize the function along the
descent direction
- test for convergence (and keep looping until a minimum is reached)
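
The loop above can be sketched as follows. This is a minimal Python sketch, not GSL code; `golden_min` is a toy stand-in for the one-dimensional minimizer, and all names are made up for illustration:

```python
import numpy as np

def gradient_descent(f, grad, x0, line_min, tol=1e-8, max_iter=1000):
    """Generic descent loop: gradient -> direction -> 1-d minimisation."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:              # convergence: gradient ~ 0
            break
        d = -g                                   # steepest-descent direction
        step = line_min(lambda e: f(x + e * d))  # 1-d minimiser along d
        x = x + step * d
    return x

def golden_min(g, a=0.0, b=1.0, tol=1e-6):
    """Toy golden-section search on a fixed bracket [a, b]."""
    phi = (np.sqrt(5) - 1) / 2
    c, d = b - phi * (b - a), a + phi * (b - a)
    while b - a > tol:
        if g(c) < g(d):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    return (a + b) / 2
```

For example, on f(x) = x.x the loop converges to the origin in a couple of iterations.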

> Each API should aim to support a variety of different interchangeable
> algorithms within the problem class. For iterative algorithms we have
> adopted an initialise-iterate-test structure by convention.
> 
> The 1-d root finding and minimisation directories give the general
> idea.

I've read the sources of the min package. There are some differences between
one-dimensional and multidimensional minimisation, and also some problems.

1) the convergence test is trickier
In general (for the Neural Networks application) we test for numerical nullity
of the gradient. This means the gradient must be known, and therefore that the
convergence test prepares part of the next iteration.
We can also use a last-move test.
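
A combined test of this kind could look like the following Python sketch (names and tolerances are illustrative assumptions, not a proposed API):

```python
import numpy as np

def converged(g, x_new, x_old, gtol=1e-8, xtol=1e-12):
    """Stop when the gradient is numerically null, or as a fallback
    when the last move was negligible."""
    if np.linalg.norm(g) < gtol:                 # numerical nullity of gradient
        return True
    return np.linalg.norm(x_new - x_old) < xtol  # last-move test
```

Note that the caller must already have computed `g` to run this test, which is exactly the overlap with the next iteration described above.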

2) restarting
Conjugate gradient methods must be restarted to work properly (the descent
direction must be reset to the gradient from time to time). Systematic
restarting is not a good idea in general; that's why some algorithms compute a
last step size to trigger restarting, or use other techniques.
Unfortunately, the restart-triggering test generally overlaps with the
convergence test.
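
One concrete restart trigger is Powell's criterion, which restarts when successive gradients are far from orthogonal; a Python sketch, offered only as an illustration (not necessarily the test meant above):

```python
import numpy as np

def should_restart(g_new, g_old, nu=0.2):
    """Powell's restart test: restart the conjugate gradient iteration
    when |g_new . g_old| >= nu * ||g_new||^2."""
    return abs(g_new @ g_old) >= nu * (g_new @ g_new)
```

It needs both the current and the previous gradient, which again couples it to whatever the convergence test has already computed.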

3) bracketing for the one-dimensional algorithm
When we have a descent direction d and a current guess x, we must optimize
the one-dimensional function g(e) = f(x + e*d). Basically, we don't have an
initial bracketing of the minimum (as required by golden section, for
instance). In general, we find one by trying a fixed value for e. By comparing
g(e) with the already known g(0), we know whether we need a smaller value
e' in (0, e) such that g(e') < g(e) and g(e') < g(0), or a bigger value.
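
That trial-step strategy can be sketched like this (a hypothetical helper, assuming a fixed initial trial e and simple grow/shrink factors):

```python
def bracket(g, e=1.0, grow=2.0, shrink=0.5, max_tries=50):
    """Return (a, m, b) with a < m < b, g(m) <= g(a) and g(m) < g(b),
    starting from the known value g(0) and a trial step e."""
    g0 = g(0.0)
    if g(e) < g0:
        # trial step went downhill: grow until g turns back up
        a, m = 0.0, e
        for _ in range(max_tries):
            b = m * grow
            if g(b) > g(m):
                return a, m, b
            a, m = m, b
    else:
        # trial step overshot: shrink until we find g(e') < g(0);
        # the previous (larger) trial then closes the bracket
        for _ in range(max_tries):
            e *= shrink
            if g(e) < g0:
                return 0.0, e, e / shrink
    raise RuntimeError("bracketing failed; restart or stop")
```

The final `raise` is the failure case discussed next: it is a signal for the driver, not something the bracketing routine can resolve on its own.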

This part can be tricky and can fail. Once again, a failure can suggest either
that restarting is a good idea or that it is time to stop.

4) driving the one-dimensional algorithm
Experiments show that for a broad class of problems (including NN), it is not
such a good idea to find _the_ minimum of the g function, because this needs
too many iterations. In general, it's enough to localise the minimum roughly.
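
One common way to localise the minimum only roughly is a backtracking search with a sufficient-decrease (Armijo) condition; this is a sketch of that standard technique, not necessarily the driving strategy meant above, and it assumes the directional derivative g'(0) is available:

```python
def rough_step(g, dg0, e=1.0, c=1e-4, shrink=0.5, max_tries=50):
    """Backtracking line search: accept the first step e satisfying
    the sufficient-decrease test g(e) <= g(0) + c * e * g'(0),
    where dg0 = g'(0) < 0 along a descent direction."""
    g0 = g(0.0)
    for _ in range(max_tries):
        if g(e) <= g0 + c * e * dg0:
            return e
        e *= shrink
    raise RuntimeError("no acceptable step found")
```

A handful of function evaluations is usually enough, instead of the many needed to pin down the exact one-dimensional minimum.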

What is a little bit annoying is that depending on the plugins you use (for
instance a restart-triggering plugin, a convergence-test plugin, etc.) you
will compute different things. If you compute the gradient in the convergence
test, for instance, then you should not recompute it in the next iteration
(this can be very expensive!). But if you really separate each part of the
algorithm from the others, you'll have to use flags or something similar to
tell the other plugins what you have already computed.
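
One way to avoid explicit flags is a shared state object that caches what has been computed and invalidates the cache when the guess moves; a Python sketch with invented names, just to make the idea concrete:

```python
import numpy as np

class MinimizerState:
    """Shared state between plugins: the gradient is computed at most
    once per point, whichever plugin asks for it first."""
    def __init__(self, grad):
        self._grad = grad
        self.x = None
        self._g = None          # cached gradient; None means stale

    def set_x(self, x):
        self.x = np.asarray(x, dtype=float)
        self._g = None          # moving invalidates the cache

    def gradient(self):
        if self._g is None:     # compute only if nobody did already
            self._g = self._grad(self.x)
        return self._g
```

With this, the convergence test and the next iteration can both call `gradient()` and the expensive evaluation still happens only once per point.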

I'll stop here, because I'm wondering whether I'm too close to my old code to
see a simplification or another way to do things. I need some feedback!

Fabrice Rossi
