This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.



Re: binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))


Gary Houston <ghouston@arglist.com> writes:

> I'm not sure I understand this proposal completely, since I don't see
> what you gain by using two ports.

No, two (rather four) port *types*.

> Wouldn't it be confusing to work
> with, e.g., if you were reading a stream of arbitrary data, would you
> read from one port some of the time to unpack bytes into Scheme and
> then from the other whenever you expected a character?

I don't think you can meaningfully or reliably do that.  You either
process a sequence of bytes or you process a sequence of characters.

> It seems to me easier to consider an input port to be a source of
> bytes, with read-char a procedure for unpacking bytes into characters.

What about peek-char, read, read-line, etc.?  What about display, write,
format?

Basically, all standard Scheme procedures work with characters,
not bytes, so an input port *is* a sequence of characters.
You can add extra procedures that read the underlying bytes,
but you will find that buffering and character conversion
make that problematic.
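
The buffering problem is not specific to Scheme.  As a rough
illustration only, here is a Python sketch (not Guile code); the
sample bytes are made up for the demonstration.  A character-level
reader pulls a whole chunk from the byte source to decode a single
character, so a later byte-level read on the same source finds
nothing left:

```python
import io

# Hypothetical input: one two-byte UTF-8 character followed by three raw bytes.
raw = io.BytesIO("é".encode("utf-8") + b"\x01\x02\x03")

# Character view layered on top of the byte source.
text = io.TextIOWrapper(raw, encoding="utf-8")

# Reading one character makes the wrapper fetch a whole chunk from raw.
ch = text.read(1)

# A byte-level read on the underlying source now finds it already drained.
rest = raw.read()

print(repr(ch), repr(rest))
```

The three trailing bytes are not lost, but they now sit inside the
character layer's buffer, already on their way to being decoded as
characters.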

> To support multiple encodings, the port could have a "current
> encoding" which could be changed at will (actually this is just to
> avoid adding an extra incompatible argument to read-char.

Can't do that in general.  Some encodings are "stateful".  I guess
you can reset the decoding state when you switch encodings.  If you
do that for output, you'll produce a meaningless document.
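
To make "stateful" concrete, here is a small Python sketch, with
ISO-2022-JP chosen as one example of such an encoding (my choice for
illustration, not something from the proposal).  The same payload
bytes mean different things depending on which escape sequence the
decoder saw earlier, so a decoder that starts over in its initial
state misreads them:

```python
# Sketch only: Python's "iso2022_jp" codec stands in for any stateful encoding.
ESC_TO_JIS = b"\x1b$B"    # escape sequence: switch to JIS X 0208
ESC_TO_ASCII = b"\x1b(B"  # escape sequence: switch back to ASCII


def split_stateful_stream(text):
    """Encode text, then separate the escape sequences from the payload bytes."""
    stream = text.encode("iso2022_jp")
    if not (stream.startswith(ESC_TO_JIS) and stream.endswith(ESC_TO_ASCII)):
        raise ValueError("expected a single JIS X 0208 run")
    payload = stream[len(ESC_TO_JIS):-len(ESC_TO_ASCII)]
    return stream, payload


stream, payload = split_stateful_stream("日本")

# The decoder sees the escape sequence first, so the payload is read as kanji.
with_state = stream.decode("iso2022_jp")

# A decoder in its initial (ASCII) state reads the same payload bytes as ASCII.
without_state = payload.decode("iso2022_jp")

print(with_state == "日本", without_state == "日本")
```

The payload decodes without error in both cases; only the first
reading recovers the original text.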

> An alternative would be to let read-char default to a global locale
> setting and add read-char/charset or something to specify variations.)

Yes, that is the C approach.  It is of course the wrong way to do it.
(It doesn't work with threads - or clean programming practices.)

> Individual characters are only part of the problem anyway: there's
> also the custom of treating strings as byte arrays that would break.

Assuming the size of a character remains at least 8 bits (i.e.
integer->char and char->integer are well defined for at least
the range 0 .. 255), I don't see where the breakage would come in.
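
The assumption can be checked with a round trip.  The following is a
Python analogue, not Scheme: chr and ord play the roles of
integer->char and char->integer, and the two helper names are
hypothetical.

```python
def bytes_to_string(data):
    """Map each byte value 0..255 to the character with the same code."""
    return "".join(chr(b) for b in data)


def string_to_bytes(s):
    """Inverse mapping; raises ValueError if a character code exceeds 255."""
    return bytes(ord(c) for c in s)


all_bytes = bytes(range(256))
as_string = bytes_to_string(all_bytes)
round_trip = string_to_bytes(as_string)

print(round_trip == all_bytes)
```

As long as every value in 0 .. 255 maps to a distinct character and
back, a string can carry arbitrary bytes unchanged.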
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/
