This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.



Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module)


On Thu, Feb 10, 2000 at 11:52:24PM -0000, Gary Houston wrote:
> It's not a good idea to use read-char and write-char to process
> arbitrary bytes.  It would probably break if multibyte character
> support was implemented (or alternatively, make multibyte character
> set support harder to implement and less elegant: just take a look at
> C.)
> 
> Unfortunately the alternatives aren't very good at present.  R5RS
> doesn't provide anything.

R5RS specifies as little as possible (to the point of not giving a way to
embed newlines in strings!), and Guile implies that strings and characters
should be used to store binary data. E.g. the docs for gh_scm2newstr()
say, "Note that Scheme strings may contain arbitrary data, including
null characters".

Because it's the only way to deal with binary data at the moment, I
use characters and strings for precisely that purpose. As you say,
this may collide with multibyte characters in the future.
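
To make the collision concrete, here is a minimal sketch. It is written
in Python rather than Scheme, purely for illustration, and it uses
Latin-1 and UTF-8 as stand-ins for "single-byte characters" and
"multibyte characters". Both choices are assumptions; nothing here says
which encoding Guile would adopt. The helper name survives_multibyte is
hypothetical.

```python
# Every possible byte value, standing in for arbitrary binary data.
data = bytes(range(256))

# With one byte per character, bytes and characters map 1:1, so binary
# data stored in a string round-trips without loss.
as_chars = data.decode("latin-1")
round_tripped = as_chars.encode("latin-1")


def survives_multibyte(raw):
    """Return True if raw can be read as characters in a multibyte
    encoding (UTF-8 here, as an assumed example)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Under these assumptions, round_tripped equals data, but
survives_multibyte(data) is false: once a character can span several
bytes, some byte sequences are no longer characters at all, so a
character-at-a-time reader cannot carry arbitrary bytes.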

To implement multibyte character support, we either need a new multibyte
character type, or we need a new single-byte type. A new single-byte type
certainly makes more sense just from a vocabulary standpoint (characters
would be characters, and bytes would be bytes).

> Do not write procedures that pack/unpack data directly from a port,
> since they would also be useful as part of a foreign function interface
> and probably for things like mmap.  Constructing a port has a certain
> overhead and imposes a serial interface.

I can't see the utility of having a non-serial interface. The only
thing you need this for is accessing foreign data, which comes from
files and network connections, no? Foreign function interfaces are
generally written in C, and mmap in Scheme... well...

> Define a new "byte-vector" type.
> [...deleted discussion of new types...]
> One could also address the entire memory by creating a shared
> byte-vector at address 0 and length equal to the address space, thus
> giving Guile a "peek" and "poke" facility.  Hurrah!
> 
> It may also be useful to make port buffers visible as shared
> byte-vectors.
> 
> Then write unpack/pack routines which operate on byte-vectors, including
> conversion to and from Scheme chars and strings.

This strikes me as needlessly complicated. The only purpose it would
serve is to support things like mmap (maybe for instance with video
frame buffers?). It just seems to me that it gives up one of the major
benefits of Scheme -- the ability to not worry about the details of
memory management or variable types.

I would prefer simply a new 8-bit numeric type called "byte", along
with "read-byte", "write-byte", "byte->integer", "integer->byte"
and "byte?" procedures. This is sufficient to do anything I can think
of with external data.
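
As a rough sketch of what that interface might look like, here are the
five procedures modeled in Python (again not Scheme, and not an
existing Guile API). The Python names are hypothetical translations of
the Scheme names above, io.BytesIO stands in for a port, and None
stands in for the end-of-file object. The copy_port helper is an
assumed usage example, not part of the proposal.

```python
import io


class Byte:
    """Hypothetical 8-bit type, distinct from characters and integers."""

    def __init__(self, value):
        if not isinstance(value, int) or not 0 <= value <= 255:
            raise ValueError("a byte must be an integer in 0..255")
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Byte) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


def integer_to_byte(n):      # integer->byte
    return Byte(n)


def byte_to_integer(b):      # byte->integer
    return b.value


def is_byte(obj):            # byte?
    return isinstance(obj, Byte)


def read_byte(port):         # read-byte
    """Read one byte from a binary port; None stands in for EOF."""
    chunk = port.read(1)
    return Byte(chunk[0]) if chunk else None


def write_byte(b, port):     # write-byte
    port.write(bytes([byte_to_integer(b)]))


def copy_port(src, dst):
    """Copy src to dst one byte at a time, never touching characters."""
    b = read_byte(src)
    while b is not None:
        write_byte(b, dst)
        b = read_byte(src)


# Usage: values that are awkward as characters pass through untouched.
src = io.BytesIO(bytes([0, 10, 128, 255]))
dst = io.BytesIO()
copy_port(src, dst)
```

The point of keeping Byte separate from both integers and characters is
that is_byte(0) is false: a byte is only ever produced by read_byte or
integer_to_byte, so character code never sees raw binary data by
accident.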
