This is the mail archive of the
gsl-discuss@sourceware.cygnus.com
mailing list for the GSL project.
The Plan
- To: GSL discussion list <gsl-discuss@sourceware.cygnus.com>
- Subject: The Plan
- From: Gerard Jungman <jungman@lanl.gov>
- Date: Fri, 20 Aug 1999 16:58:36 -0600
- Organization: LANL T-8
Brian and I had some discussion this week about The Big Picture.
The issues are the following:
1) The role of GSL is unclear. Possibilities are:
- GSL is a free equivalent of Numerical Recipes
- GSL is a high quality toolbox for incorporation
into high-level scientific software (i.e. wrapped
in products like Octave, or behind scripting interfaces)
- GSL is a high quality toolbox for scientific end-users
- GSL is mainly about converting legacy Fortran code
and putting the result under the GPL.
Obviously there is some truth in each of these.
2) With regard to the last characterization above, our
experience has been that it is very misleading. The
idea that "well, the basic code exists on Netlib, we
just have to convert it" is completely off the mark
because most of the effort does not go into numerical
algorithms, but into overall design issues. Fortranitis
is a lot more deep-rooted than a simple urge to index
arrays starting at 1. Since most of the work goes into
overall design, it benefits from a fresh approach.
3) If the scientific end-user is the main target, then C
is inappropriate, and we should concentrate on a C++ product.
The idea that the C backend could be wrapped in C++
is unacceptable since the design does not benefit from
important C++ features. We feel that the decision to
write in C was made in the remote past, and since that time
many of the thorny issues with C++ have disappeared;
C++ is actually useful now, and not just an entertaining
topic for conversation and witticism.
4) Linear algebra is a real problem. Historically the
design problem of mapping the linear algebra world
onto a software system has not been solved. There
are many Fortran and C linear algebra packages out there,
and none of them achieves a representation of the
problem domain that is both efficient and high-level.
We feel that the best hope for the future lies with
current C++ work; gimmicks like expression templates
have changed the playing field. Therefore, any scientific
library must be prepared to mesh well with the
soon-to-arrive C++ solutions for the linear algebra world.
5) On the same subject, we do not feel it is possible
for us to create a high quality linear algebra
toolbox, in any language. The effort required
is too high in such a large and specialized world,
where the basic issue of mapping the problem domain
onto a software design is still an open research problem.
Therefore, we are forced to prepare ourselves to accept
the best-effort solution of the linear algebra folks.
Surveying the field, it is obvious that the current
round of development is all C++ directed
(with a smattering of Java (!)).
6) Scripting interfaces to C++ backends have become
commonplace, and the necessary tools exist: not only
SWIG, but also other projects in development, like
Siloon. Therefore wrapping something like GSL
has become a language-neutral issue. So the only
remaining question from this standpoint is the
question of lightweight-ness. We consider this to
be a serious issue only for embedded applications.
Further, we feel that the general fear of C++ bloat
is unfounded; good designs are bloat-free.
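An aside on point 4, for readers who have not met the technique: the
following is a minimal sketch of what "expression templates" means, not
a proposal for any actual interface. All names here (Expr, Sum, Vec) are
hypothetical and belong to no existing library. The idea is that a + b
does not compute anything; it builds a lightweight object describing the
sum, and the arithmetic happens in a single loop at assignment time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; Expr, Sum and Vec are not part of GSL
// or of any real linear algebra package.

// CRTP base class: tags a type as a vector expression.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Lazy node for the elementwise sum of two expressions.
// It stores references only; nothing is computed until operator[] is called.
template <typename L, typename R>
struct Sum : Expr<Sum<L, R>> {
    const L& lhs;
    const R& rhs;
    Sum(const L& l, const R& r) : lhs(l), rhs(r) {}
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Concrete vector that owns its storage.
struct Vec : Expr<Vec> {
    std::vector<double> data;
    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning from any expression evaluates it in one pass,
    // element by element, without temporary vectors.
    template <typename E>
    Vec& operator=(const Expr<E>& e) {
        const E& ex = e.self();
        assert(ex.size() == data.size());
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = ex[i];
        return *this;
    }
};

// operator+ builds a Sum node instead of computing a result.
template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}
```

With these definitions, r = a + b + c for three Vec objects of equal
length runs one loop and allocates no intermediate vectors, which is
the kind of high-level yet efficient notation the paragraph alludes to.
The Sum nodes hold references, so an expression should be assigned in
the same statement that builds it rather than stored for later.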
You can see where this is heading. However, before we start
squabbling, it is worth considering the following level-headed
analysis:
a) If GSL had started as a C++ project, it would have
tanked like 90% of all numerical C++ projects.
We would still be arguing about how to do
linear algebra, and everything else would
have come to a standstill.
b) GSL works.
c) The investment in the code is large.
The investment in the design (especially the changes
that Brian and I have made in the last 2 years) is larger.
Nevertheless, on many days we feel like the current C code base
is as much a hindrance to design as anything. We have no love for it.
If one entertains the idea that the current GSL code is simply
a learning experience, then all the design investment is saved.
Design investment is the most valuable asset in any project like this.
All this babbling leads us to the following question.
When do we declare ourselves done and the GSL C code base as frozen?
If we felt that we were creating the best possible technology
with the right tools, then having an open-ended project would not
be a problem. It would simply grow, from young swan to middle-aged swan,
to old and venerable swan.
However, I think what we have here is not a swan but a chicken.
Useful and tasty, but nothing to get too excited about.
Therefore the open-ended nature of the project is unacceptable.
We are ready to move on to a braver and newer world. Therefore
we have to define fixed and attainable goals for this code base.
Realistically we have come to the following conclusion. Most
scientific end-users define "numerical computing" to be the
content of Numerical Recipes. Sad, yes, but that is another
argument for another day. We take this operational definition,
examine the table of contents, and conclude that the following
are the remaining targets. I have indicated the responsible
parties in brackets.
Priority 1)
- multidimensional root finding [Brian]
- multidimensional optimization [Brian]
- ODEs, initial value [Jerry]
- somewhat expanded linear algebra
(tridiag, symmetric, hermitian systems) [Jerry]
Priority 2)
- sin, cos transforms [Brian]
- statistical tests [Brian ?]
Priority 3)
- ODEs, boundary value [Jerry ?]
- integral equations [Jerry ?]
- least squares and other estimation
subjects, including robust methods [Brian]
And that's it. When these are finished we will have reached
a tangible milestone. I would move at that point to consider
freezing the GSL C code base. I would certainly be more interested
in working on a C++ product, reusing all the appropriate
design elements from the C code.
Conservatively, and roughly, each of these elements should
take about a man-month. I have already, in the last week,
finished the first pass at the ODE initial value code.
Given that estimate, a concerted effort would have us
finishing within 6 months.
I think the main psychological difficulty we have to combat
is the feeling that stopping and freezing the code is
a kind of failure. Nothing could be further from the truth;
at that point we will have a manifestly broad and high-quality
set of tools. We will never be able to get everything in and
please everybody, and the main point I want to make here
is that I do not think we should try with the current code base.
That's all folks.
--
G. Jungman