This is the mail archive of the guile@cygnus.com mailing list for the guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

About FFI (Was Re: ANN: Schelab..)


On Tue, 08 Sep 1998, you wrote:

>>   After a week of hard work I'm proud to announce that the prototype for
>> Schelab, a numerical analysis library for Scheme is out. Currently only
>> simple operations on matrices, such as addition, subtraction and
>> products are implemented, but up from now it is easy to add any new
>> feature.
>>   Schelab is being developed on MzScheme, Guile 1.2 (buggy) and Gambit-C
>> 3.0.
>
>I've had a read through your docs and you have a lot of interesting and
>good ideas. I haven't got it up and running yet because I don't have
>Maroon-V3 (and I'm not familiar with how it works).

Thanks for your comments. About the installation: Meroon comes with the package
and needs no special installation for running under Guile or MzScheme. The only
messy point is that you must rebuild the DLLs yourself using a C & FORTRAN
compilers and the makefiles I supply.

>I like the way you do the tensor index as a list so that arbitrary
>dimensions of tensor can be handled. My matrix is always 2D (even if it
>is a 1D vector it is 2D) but I didn't think of the idea of using a list
>until I saw you do it. I might change mine over sometime.

I did it this way influenced by how Scilab builds hypermatrices: a list that
contains a 1d vector, plus information about its range and its dimensions.
As the kernel already handles 1d vectors extremely fast, resulting ops. are
comparable to those of 2d matrices.

>I don't quite follow the idea of `coercers'. It seems that you are
>saying that every foreign object must have a scheme equivalent.

Not necesarily so. The coercion process is that of adapting an object of one
type to a different one. From a users point of view there are two kinds of
`coercers': the family that implements Scheme <-> Meroon coercion, and the
family that implements coercion among Meroon objects themselves (including
foreign objects).

The way the library is implemented, no object is obligued to have a coercer for
all types, not even back an forth to Scheme data types: in the docs I simply say
that whenever exist a (->someobject) function that creates a foreign object up
from a scheme one, the inverse, (->scheme) should also exist.

Just to make ideas clear, `coercers' exist to

1) Simplify the creation of small objects, including foreign constants in Scheme
code. For instance
....
(define v (->fortran-double* (vector 1.0 1.5 2.0 2.5 3.0)))
....

2) Allow passing objects to foreign functions, even when they are not of the
same type but somewhat compatible: for example, thanks to

	(define-method (->fortran-double* (A real-matrix))
		(real-matrix-data A))

you can add two matrices using the BLAS DAXPY routine

	(define a (make-real-matrix 10 10 1.3))
	(define b (make-real-matrix 10 10 1.3))
	(blas-daxpy (* 10 10) 1.0 a 1 b 1) ;  a <- 1.0 * b + a

even though initially BLAS-DAXPY was declared to accept only fortran-double*.

>Now I'll admit that the scheme data types are quite versatile but
>suppose you are storing sparse matricies of the order of 10000x10000 size
>but with only 5 to 10 non-zero elements in each row. There is no problem
>storing it as a foreign object but when you use a coercer it tries to
>force it into a scheme vector and blows out the memory. You could have
>the coercer consider several possible translation options and choose the
>one it likes best but then you have foreign objects
>translating into several scheme representations. Worse, everything that
>uses a foreign object has to fiddle with several representations.
>Anyhow, I see pitfalls here without a lot of careful thinking.

As a side note full storage is not suitable even for the foreign
representation: you'll probably have the same storage problem both in Scheme
and in FORTRAN. It should be more clever to define and use a Sparse Matrix
class. This class, if you ever wish to use a Scheme representation for it,
would use just three small vectors with nonzero objects. You can build sparse
matrices with the tools in the Schelab library, though it requires some coding
and I still had no time to do it. For example a prototype of how it would go:

(define-class sparse-matrix matrix ((= rows) (= columns) (= i) (=j) (=data))
(define (range start end)
	(define len (- end start))
	(define v (make-vector len 0))
	(do ((n 0 (+ n 1))
		((= n len) v)
	(vector-set! v n (+ n start))))
(define (speye matrix-size)
	(define i (range 0 (- matrix-size 1)))
	(define j (range 0 (- matrix-size 1)))
	(instantiate sparse-matrix
		:rows matrix-size :columns matrix-size
		:i (range 1 matrix-size)
		:j (range 1 matrix-size)
		:data (make-fortran-double* matrix-size 1.0)))

This code defines a function that returns identity matrix in sparse format
(coordinates method). Don't take it too seriously because it's not optimized:
eventually, I will use a fortran-integer* to store the numbers so that FORTRAN
code can access the data. OTOH, it shows that Scheme has enough power to
hold complex data structures (You could create a sparse matrix just with
uniform vectors), even though the foreign representations are somewhat more
convenient for exchanging data with other languages.

>Another thing that I like is the idea of doing all FORTRAN foreigns
>through a thing called fortran-call and then using a scheme declaration
>to generate wrappers in scheme. It results in a very neat way of handling
>foreign functions, especially the way it does type checks in scheme
>but is a problem regarding efficiency: You convert everything into pointers
>to foreign data which requires that you malloc many small blocks for temporary
>data and then free them after each call. Wrappers written in C will do this
>on the stack which is much quicker (and doesn't depend on the garbage
>collector to clean up the temporary space). Maybe this is not such a big
>issue but the general idea is to avoid generating garbage where possible.

My experience shows that GC is not such a problem. I've seen no real
performance penalty due to it. Instead, it is the use of an interpreted Meroon
library what takes the bytes and the time when in Guile: every conversion must
do a check for the type of the object and this check in Guile is not cheap.
Take this piece of code

(time (do ((i 0 (+ i 1)))
	((= i 1000))
	(->Pointer (->fortran-integer a))))

which under Guile takes 568 milliseconds to execute, while it takes just 70 in
Gambit (of which 20 are GC I must admit). (A PII/300, 128Mb RAM but quite
loaded). The time difference is simply due to Meroon being compiled rather than
interpreted.

About glue code, I must recognize that I deeply dislike it, though I see that
it is the right thing when looking for top performance and integration with
other pieces of the O.S. But wouldn't it be possible to do it without pushing
the interpreter far with smobs? I think there's an alternative which is:

1) Creating a portable OO system for Scheme --perhaps an improved version of
Meroon--, and get it to compile as a DLL.

2) Create classes to let Scheme manipulate objects in the foreign
representation: let's call them foreign types. Up from atomic objects, such as
C-int, C-double, C-char, etc, this should be feasible and not very difficult.

3) Create a macro facility to generate Scheme wrappers. These wrappers should
accept any kind of Scheme objects, perform type checks and conversions when
possible and finally call the C routine which can be

a) A glue routine such as those generated by GWRAP, though extended to know
about the foreign type classes. In any case, these routines should be no longer
than a single line, as checks have been made already in Scheme.

b) A sort of `C-call' or `fortran-call' for those cases where the function can
be called directly (as when it gets parameters by reference, or just pointers
and integers). As an example, this is the general purpose function I use to call
fortran subroutines:

----
SCM fortran_call(SCM address, SCM args) {
#define MAX_PARAMETERS 16
void *buffer[MAX_PARAMETERS];
fortran_subroutine f;
SCM first;
int n = 0, i;

n = gh_list_length(args);

if (n > MAX_PARAMETERS) {
	ERROR("fortran-call: too many parameters\n");
	return SCM_BOOL_F;
}

for (i = 0; i < n; i++) {
	first = gh_car(args);
	if (!SCM_PTRP(first) && !SCM_BLOCKP(first)) {
		ERROR("fortran-call: argument not a pointer\n");
		return SCM_BOOL_F;
	}
	buffer[i] = SCM_2_PTR(first);
	args = gh_cdr(args);
}
for (i = n; i < MAX_PARAMETERS; i++)
	buffer[i] = NULL;

/* 'f' represents a dyn. linked symbol and is converted to a pointer */
f = gh_scm2ulong(address);
SCM_DEFER_INTS;
if (n <= 8) {
	(*f)(buffer[0], buffer[1], buffer[2], buffer[3],
		buffer[4], buffer[5], buffer[6], buffer[7]);
} else {
	(*f)(buffer[0], buffer[1], buffer[2], buffer[3],
		buffer[4], buffer[5], buffer[6], buffer[7],
		buffer[8], buffer[9], buffer[10], buffer[11],
		buffer[12], buffer[13], buffer[14], buffer[15]);
}
SCM_ALLOW_INTS;
return SCM_BOOL_T;
}
----

>On the subject, the way that C uses the stack for local variables
>is a very powerful and efficient system. In guile the local variables
>within a let or a function are stored in much the same way as other
>variables which implies that every time you enter a (let) and have
>a bunch of variables declared, you generate a bunch of garbage even
>though you know that they are only temporary variables.
>
>Very clever use of recursion and `functional programming' often
>eliminates the need for let but leads to some mind-bending code
>(and still the function argument bindings will surely generate garbage too).
>Is there a way for scheme to make more effective use of the stack?
>Would this spoil the clean structure of the language?
>Even if it introduced the same hole that C suffers (i.e. return()ing
>a pointer to the stack and having the contents clobbered)
>I for one would be willing to wear that.
>
>Have I got the right idea of what is going on here or has this issue
>already been taken care of?

Well, I don't know how Guile does handles this, but I bet there are
implementations of Scheme out there that compile functions to a sort of machine
code and do use the sort of techninque you mention -- stacks. Indeed, though
I'm not sure about this point, I think that should be the right stuff: Scheme
is statically scoped, as C, and doing
	(define b 0.3)
	(let ((a (cos b)))
	...)
should imply, in byte compiled code, no more overhead than C's
	double b = 0.3;
	{
	double a = cos(b);
	...
	}

Regards
	Juanjo

BTW: there must be a hundred of interpreters out there, most of which use glue
code. Now the question is, wouldn't it be worth creating a routine to handle
the program stack and call arbitrary routines? The assembly code would probably
be less than 30 lines, and a lot of people would benefit (even those from Tcl,
Perl or Python I believe), but I see no such a thing anywhere :( Is it so
difficult?

--
Juan Jose Garcia Ripoll          job: jjgarcia@ind-cr.uclm.es
Dpto. de Matematicas             home: worm@arrakis.es
E.T.S.I. Industriales            www: http://www.arrakis.es/~worm
Univ. de Castilla-La Mancha
Ciudad Real (Spain)