This is the mail archive of the
guile@sourceware.cygnus.com
mailing list for the Guile project.
Re: binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))
> From: Per Bothner <per@bothner.com>
> Date: 15 Feb 2000 10:56:00 -0800
>
> sen_ml@eccosys.com writes:
>
> > -ports are defined in terms of chars, and chars might not be
> > fixed in width to 8-bits in the future.
>
> The way I think it should work (at least this is what I'm doing in Kawa):
>
> You need four kinds of (generic) ports:
>
> byte-input-port: Reads a sequence of 8-bit bytes
> char-input-port: Reads a sequence of (wide) characters (e.g. Unicode).
> byte-output-port: Writes a sequence of 8-bit bytes
> char-output-port: Writes a sequence of (wide) characters (e.g. Unicode).
>
> For compatibility and convenience, you want a procedure like read-char
> to accept either a byte-input-port or a char-input-port. If the
> specified port is a char-input-port, the result should be a character,
> as if returned by (char->integer BYTE). Similarly, write-char
> works on both byte-output-port and char-output-port.
I'm not sure I understand this proposal completely, since I don't see
what you gain by using separate byte and char ports. Wouldn't it be
confusing to work with? E.g., if you were reading a stream of arbitrary
data, would you read from one port some of the time to unpack bytes into
Scheme and then from the other whenever you expected a character?
It seems to me easier to consider an input port to be a source of
bytes, with read-char a procedure for unpacking bytes into characters.
To support multiple encodings, the port could have a "current
encoding" which could be changed at will. (Actually, this is just to
avoid adding an extra, incompatible argument to read-char. An
alternative would be to let read-char default to a global locale
setting and add read-char/charset or something similar to specify
variations.)
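To make the idea concrete, here is a rough sketch of reading mixed data
from a single port. It is only an illustration under assumptions that are
not part of the proposal above: it assumes a Guile that provides
set-port-encoding! and the (ice-9 binary-ports) module, and the stream
layout (a one-byte character count followed by UTF-8 text) and the helper
read-chars are made up for the example.

```scheme
(use-modules (rnrs bytevectors)      ; u8-list->bytevector
             (ice-9 binary-ports))   ; open-bytevector-input-port, get-u8

;; Hypothetical stream: one count byte (5 characters), then the
;; UTF-8 encoding of "h", e-acute (two bytes: C3 A9), "l", "l", "o".
(define data
  (u8-list->bytevector '(5 #x68 #xC3 #xA9 #x6C #x6C #x6F)))

(define port (open-bytevector-input-port data))

;; The port is a source of bytes: read the count as a raw byte.
(define count (get-u8 port))

;; Change the port's "current encoding" before unpacking characters.
(set-port-encoding! port "UTF-8")

;; Hypothetical helper: unpack N characters from PORT into a string.
(define (read-chars port n)
  (let loop ((n n) (acc '()))
    (if (zero? n)
        (list->string (reverse acc))
        (loop (- n 1) (cons (read-char port) acc)))))

(define text (read-chars port count))
```

The same port serves both kinds of read, so the caller never has to
decide which of two ports to consult; only the encoding setting changes.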
Individual characters are only part of the problem anyway: the custom
of treating strings as byte arrays would break as well.