This is the mail archive of the gsl-discuss@sourceware.cygnus.com mailing list for the GSL project.



The Plan



Brian and I had some discussion this week about The Big Picture.
The issues are the following:

	1) The role of GSL is unclear. Possibilities are:
	    - GSL is a free equivalent of Numerical Recipes
	    - GSL is a high quality toolbox for incorporation
	      into high-level scientific software (i.e. wrapped
	      in products like Octave, or behind scripting interfaces)
	    - GSL is a high quality toolbox for scientific end-users
	    - GSL is mainly about converting legacy Fortran code
	      and putting the result under the GPL.
	   Obviously there is some truth in each of these.

	2) With regard to the last characterization above, our
	   experience has been that it is very misleading. The
	   idea that "well, the basic code exists on Netlib, we
	   just have to convert it" is completely off the mark
	   because most of the effort does not go into numerical
	   algorithms, but into overall design issues. Fortranitis
	   is a lot more deep-rooted than a simple urge to index
	   arrays starting at 1. Since most of the work goes into
	   overall design, it benefits from a fresh approach.

	3) If the scientific end-user is the main target, then C
	   is inappropriate, and we should concentrate on a C++ product.
	   The idea that the C backend could be wrapped in C++
	   is unacceptable since the design would not benefit from
	   important C++ features. We feel that the decision to
	   write in C was made in the remote past, and since that time
	   many of the thorny issues with C++ have disappeared;
	   C++ is actually useful now, and not just an entertaining
	   topic for conversation and witticism.

	4) Linear algebra is a real problem. Historically the
	   design problem of mapping the linear algebra world
	   onto a software system has not been solved. There
	   are many Fortran and C linear algebra packages out there,
	   and none of them achieves a representation of the
	   problem domain that is both efficient and high-level.
	   We feel that the best hope for the future lies with
	   current C++ work; gimmicks like expression templates
	   have changed the playing field. Therefore, any scientific
	   library must be prepared to mesh well with the
	   soon-to-arrive C++ solutions for the linear algebra world.

	5) On the same subject, we do not feel it is possible
	   for us to create a high quality linear algebra
	   toolbox, in any language. The effort required
	   is too high in such a large and specialized world,
	   where the basic issue of mapping the problem domain
	   onto a software design is still an open research problem.
	   Therefore, we are forced to prepare ourselves to accept
	   the best-effort solution of the linear algebra folks.
	   A survey of the field makes it obvious that the current
	   round of development is all C++ directed
	   (with a smattering of Java (!)).

	6) Scripting interfaces to C++ backends have become
	   commonplace, and the necessary tools exist: not only
	   Swig, but also other projects in development, like
	   Siloon. Therefore wrapping something like GSL
	   has become a language-neutral issue. The only
	   remaining question from this standpoint is how
	   lightweight the result is. We consider this to
	   be a serious issue only for embedded applications.
	   Further, we feel that the general fear of C++ bloat
	   is unfounded; good designs are bloat-free.
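
To make the "expression templates" remark in point 4 concrete, here is a
minimal sketch of the technique. It is not taken from GSL or from any
existing linear algebra package; the names Expr, Sum, Scaled, Vec and axpy
are all invented for this illustration. The idea is that a*x + y builds a
small compile-time expression tree instead of temporary vectors, and the
whole thing is evaluated in one loop at assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: every vector-like expression derives from Expr<Itself>.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Lazy node for l + r; nothing is computed until operator[] is called.
// Nodes hold references, so they must not outlive the full expression
// that created them (do not store them with `auto`).
template <typename L, typename R>
class Sum : public Expr<Sum<L, R>> {
    const L& l_;
    const R& r_;
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) { assert(l.size() == r.size()); }
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t size() const { return l_.size(); }
};

// Lazy node for s * e.
template <typename E>
class Scaled : public Expr<Scaled<E>> {
    double s_;
    const E& e_;
public:
    Scaled(double s, const E& e) : s_(s), e_(e) {}
    double operator[](std::size_t i) const { return s_ * e_[i]; }
    std::size_t size() const { return e_.size(); }
};

// Concrete storage; the only class that owns data.
class Vec : public Expr<Vec> {
    std::vector<double> d_;
public:
    explicit Vec(std::size_t n, double v = 0.0) : d_(n, v) {}
    double operator[](std::size_t i) const { return d_[i]; }
    double& operator[](std::size_t i) { return d_[i]; }
    std::size_t size() const { return d_.size(); }

    // Assigning from any expression evaluates it in a single fused loop.
    template <typename E>
    Vec& operator=(const Expr<E>& e) {
        const E& x = e.self();
        assert(x.size() == size());
        for (std::size_t i = 0; i < size(); ++i) d_[i] = x[i];
        return *this;
    }
};

template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

template <typename E>
Scaled<E> operator*(double s, const Expr<E>& e) {
    return Scaled<E>(s, e.self());
}

// Hypothetical helper: computes a*x + y with no intermediate vectors.
Vec axpy(double a, const Vec& x, const Vec& y) {
    Vec out(x.size());
    out = a * x + y;  // builds Sum<Scaled<Vec>, Vec>, then one loop
    return out;
}
```

Calling axpy(2.0, x, y) reads like the mathematics but should compile down
to roughly the loop one would write by hand in C, which is the sense in
which such tricks change the playing field.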


You can see where this is heading. However, before we start
squabbling, it is worth considering the following level-headed
analysis:

	a) If GSL had started as a C++ project, it would have
	   tanked like 90% of all numerical C++ projects.
	   We would still be arguing about how to do
	   linear algebra, and everything else would
	   have come to a standstill.

	b) GSL works.

	c) The investment in the code is large.
	   The investment in the design (especially the changes
	   that Brian and I have made in the last 2 years) is larger.


Nevertheless, on many days we feel like the current C code base
is as much a hindrance to design as anything. We have no love for it.
If one entertains the idea that the current GSL code is simply
a learning experience, then none of the design investment is lost.
Design investment is the most valuable asset in any project like this.


All this babbling leads us to the following question.
When do we declare ourselves done and the GSL C code base as frozen?

If we felt that we were creating the best possible technology
with the right tools, then having an open-ended project would not
be a problem. It would simply grow, from young swan to middle-aged swan,
to old and venerable swan.

However, I think what we have here is not a swan but a chicken.
Useful and tasty, but nothing to get too excited about.

Therefore the open-ended nature of the project is unacceptable.
We are ready to move on to a braver and newer world, so
we have to define fixed and attainable goals for this code base.

Realistically we have come to the following conclusion. Most
scientific end-users define "numerical computing" to be the
content of Numerical Recipes. Sad, yes, but that is another
argument for another day. We take this operational definition,
examine the table of contents, and conclude that the following
are the remaining targets. I have indicated the responsible
parties in brackets.

	Priority 1)
		- multidimensional root finding [Brian]
		- multidimensional optimization [Brian]
		- ODEs, initial value [Jerry]
		- somewhat expanded linear algebra
		  (tridiag, symmetric, hermitian systems) [Jerry]

	Priority 2)
		- sin, cos transforms [Brian]
		- statistical tests [Brian ?]

	Priority 3)
		- ODEs, boundary value [Jerry ?]
		- integral equations [Jerry ?]
		- least squares and other estimation
		  subjects, including robust methods [Brian] 


And that's it. When these are finished we will have reached
a tangible milestone. I would move at that point to consider
freezing the GSL C code base. I would certainly be more interested
in working on a C++ product, reusing all the appropriate 
design elements from the C code.

Conservatively, and roughly, each of these elements should
take about a man-month. I have already, in the last week,
finished the first pass at the ODE initial value code.
Given that estimate, and with the nine items above split between
two people, a concerted effort would have us finishing within 6 months.

I think the main psychological difficulty we have to combat
is the feeling that stopping and freezing the code is
a kind of failure. Nothing could be further from the truth;
at that point we will have a manifestly broad and high-quality
set of tools. We will never be able to get everything in and
please everybody, and the main point I want to make here
is that I do not think we should try with the current code base.

That's all folks.


-- 
G. Jungman
