This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.
Re: binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))
> From: Per Bothner <per@bothner.com>
> Date: 16 Feb 2000 14:04:54 -0800
>
> > I'm not sure I understand this proposal completely, since I don't see
> > what you gain by using two ports.
>
> No, two (rather four) port *types*.
Do you think combined input/output ports are more trouble than they're
worth?
> > Wouldn't it be confusing to work
> > with, e.g., if you were reading a stream of arbitrary data, would you
> > read from one port some of the time to unpack bytes into Scheme and
> > then from the other whenever you expected a character?
>
> I don't think you can meaningfully or reliably do that. You either
> process a sequence of bytes or you process a sequence of characters.
It seems a bit restrictive to allow only meaningful and reliable
formats. Examples would be things like reading a binary database
record with string fields, or decoding network protocols (I'm not sure
which ones offhand; doesn't HTTP start with an ASCII header and then
switch to a character set specified in that header?)
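To make that HTTP-style example concrete: the pattern is to read the header as raw bytes (ASCII being a safe subset of most encodings) and only then pick a decoder for the body. A sketch in Python rather than Scheme, purely for illustration; the stream contents are invented:

```python
import io

raw = io.BytesIO(
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"\r\n"
    b"caf\xe9"                      # body: "cafe" with e-acute, in Latin-1
)

# Phase 1: scan the header as raw bytes.
charset = "ascii"
for line in iter(raw.readline, b"\r\n"):
    name, _, value = line.rstrip(b"\r\n").partition(b": ")
    if name.lower() == b"content-type" and b"charset=" in value:
        charset = value.split(b"charset=")[1].decode("ascii")

# Phase 2: wrap the remaining bytes in a text decoder for the body.
body = io.TextIOWrapper(raw, encoding=charset).read()
```

With a single byte-oriented port type this needs no mode switch at all; with separate binary and character port types you would have to hand the binary port off to a character-port constructor at the blank line.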
> > It seems to me easier to consider an input port to be a source of
> > bytes, with read-char a procedure for unpacking bytes into characters.
>
> How about peek-char, read, read-line, etc? What about display, write,
> format?
Your system could make read and read-line simpler or more efficient, I
think, by allowing them to scan the buffer without needing to decode
the bytes.
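That works because in ASCII-compatible encodings like UTF-8, the newline byte never occurs inside a multi-byte character, so the line scan can stay at the byte level. A quick Python illustration (not Guile code):

```python
# In UTF-8, bytes 0x00-0x7F never appear inside a multi-byte sequence,
# so line boundaries can be found in the raw buffer and each line
# decoded only afterwards.
buf = "première\nligne два\n".encode("utf-8")
lines = [chunk.decode("utf-8") for chunk in buf.split(b"\n")[:-1]]
```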
> Basically, all standard Scheme procedures work with characters,
> not bytes, so an input port *is* a sequence of characters.
> You can add extra procedures that read the underlying bytes
> but you will find that buffering and character conversion
> make that problematical.
Only once you place decoded characters in the buffers instead of
bytes, though? If the buffers hold raw bytes, that problem doesn't
arise.
> > To support multiple encodings, the port could have a "current
> > encoding" which could be changed at will (actually this is just to
> > avoid adding an extra incompatible argument to read-char.
>
> Can't do that in general. Some encodings are "stateful". I guess
> you can reset the decoding state when you switch encodings. If you
> do that for output, you'll produce a meaningless document.
Maybe not in general, but it would be up to the user not to mess it up.
Banning it completely seems like overkill.
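The statefulness Per mentions is easy to demonstrate with ISO-2022-JP, which switches character sets via escape sequences. A Python illustration (again, not the proposed Guile interface):

```python
import codecs

data = "漢字".encode("iso-2022-jp")
# The stream begins with ESC $ B (shift into JIS X 0208) and ends with
# ESC ( B (shift back to ASCII): a byte's meaning depends on that state.
assert data.startswith(b"\x1b$B") and data.endswith(b"\x1b(B")

# An incremental decoder keeps the shift state across calls, so a
# split feed still decodes correctly...
dec = codecs.getincrementaldecoder("iso-2022-jp")()
out = dec.decode(data[:4]) + dec.decode(data[4:], final=True)

# ...but resetting the state mid-stream, as switching encodings would,
# silently turns the remaining JIS bytes into ASCII garbage.
dec2 = codecs.getincrementaldecoder("iso-2022-jp")()
dec2.decode(data[:4])
dec2.reset()
garbled = dec2.decode(data[4:], final=True)
```

So resetting on an encoding switch is safe only at points where the old encoding has already returned to its initial state; anywhere else, the data on one side of the switch gets corrupted.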
> > An alternative would be to let read-char default to a global locale
> > setting and add read-char/charset or something to specify variations.)
>
> Yes, that is the C approach. It is of course the wrong way to do it.
> (It doesn't work with threads - or clean programming practices.)
>
> > Individual characters are only part of the problem anyway: there's
> > also the custom of treating strings as byte arrays that would break.
>
> Assuming the size of character remains at least 8 bits (i.e.
> integer->char and char->integer are well defined for at least
> the range 0 .. 255), I don't see where the breakage would come in.
I was thinking of places where strings are passed to various system-call
and gh_ interfaces: reading a string (of arbitrary bytes) with read-line
and then writing it to such an interface would end up modifying the bytes.
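Concretely, the hazard is that a decode/encode round trip through the locale's character set need not be the identity on arbitrary bytes. In Python terms (illustrative only):

```python
data = bytes([0x41, 0xFF, 0x0A])        # 'A', a stray 0xFF, newline

# Under a UTF-8 "locale", the invalid byte is replaced on input, so
# writing the string back out changes the data...
text = data.decode("utf-8", errors="replace")
changed = text.encode("utf-8") != data

# ...whereas a byte-transparent encoding such as Latin-1 round-trips,
# which is what code treating strings as byte arrays relies on.
roundtrips = data.decode("latin-1").encode("latin-1") == data
```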