This is the mail archive of the guile@cygnus.com mailing list for the guile project.

Re: mbstrings

To: Guile Mailing List <guile@cygnus.com>
Subject: Re: mbstrings
From: Per Bothner <bothner@cygnus.com>
Date: Thu, 16 Oct 1997 22:48:33 -0700

Jens-Ulrik Holger Petersen <petersen@kurims.kyoto-u.ac.jp> writes:
> I am in favour of the idea of using Unicode, but just for the sake of
> completeness I would like to mention that the XEmacs-20 implementation
> of Mule does use characters (unlike Emacs-20 integer implementation).

Well - not really.

XEmacs does have a 19-bit character encoding based on Mule.
The trouble is that this encoding is really two-dimensional,
consisting of two parts:  a code that specifies the "character
encoding", combined with a "position in that encoding".  Since
many characters are common to many character encodings, comparing
characters for equality becomes a philosophical problem:

1) Do you just compare the character codes, ignoring that what is
conceptually the same character may be encoded in many ways?

2) Do you canonicalize the characters so that equivalent characters
are equal?

2a) Do you do the canonicalization when you do the comparison, or

2b) when the character is created (e.g. read or stored in a string)?

Of course - once you do canonicalization, you might as well use
Unicode.
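
To make the choices above concrete, here is a rough sketch in Python.
It is only an illustration, not the actual XEmacs/Mule representation:
the MuleChar type and the helper names are hypothetical, and Python
codec names stand in for the "character encoding" code.  It relies on
the fact that Latin-2 (ISO 8859-2) has S-with-caron at position 0xA9,
while Latin-9 (ISO 8859-15) has the same character at position 0xA6.

```python
from typing import NamedTuple


class MuleChar(NamedTuple):
    """Hypothetical two-part character: (charset, position in charset)."""
    charset: str   # a Python codec name, standing in for the charset code
    position: int  # byte position within that single-byte charset


def canonical(ch: MuleChar) -> int:
    """Map a (charset, position) pair to its Unicode code point."""
    return ord(bytes([ch.position]).decode(ch.charset))


def raw_equal(a: MuleChar, b: MuleChar) -> bool:
    """Option 1: compare the two-part codes as they are."""
    return a == b


def canonical_equal(a: MuleChar, b: MuleChar) -> bool:
    """Option 2a: canonicalize at comparison time."""
    return canonical(a) == canonical(b)


def make_char(charset: str, position: int) -> int:
    """Option 2b: canonicalize at creation; the result is a code point."""
    return canonical(MuleChar(charset, position))


# The same conceptual character, reached through two different charsets.
s_latin2 = MuleChar("iso8859_2", 0xA9)
s_latin9 = MuleChar("iso8859_15", 0xA6)
```

With option 1 the two values compare as different; with 2a they compare
as equal, at the cost of a table lookup on every comparison; with 2b
the stored value is simply a Unicode code point, which is the point
made above.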

In other words:  The Mule "characters" are problematical as
characters in the Scheme sense.

	--Per Bothner
Cygnus Solutions     bothner@cygnus.com     http://www.cygnus.com/~bothner

Follow-Ups:
- Re: mbstrings
  - From: NIIBE Yutaka <gniibe@etl.go.jp>

References:
- Re: mbstrings
  - From: Jens-Ulrik Holger Petersen <petersen@kurims.kyoto-u.ac.jp>
