This is the mail archive of the xsl-list@mulberrytech.com mailing list .



Re: MSXML - Processing non standard characters


On Thursday, August 02, 2001 5:41 AM
Warren Keane wrote:

[..]

> the offending character is the "1/2" at column 386. Is
> "ISO-8859-1" the proper encoding to remedy this situation?

Declaring the encoding of that and similar documents to be ISO-8859-1 is
indeed what you need. But note that all you are doing there is telling
the processor what the encoding (already) is. You aren't altering the
encoding itself. Without that encoding declaration, the processor
assumes utf-8 and so cannot parse the file, because a stand-alone
byte of decimal value 189 (the representation of VULGAR FRACTION ONE
HALF in ISO-8859-1) is not a legal sequence in a utf-8 encoded document. The
encoding of that same abstract character in utf-8 would be the two byte
sequence (decimal) 194, 189. This bewilders people who try to read
utf-8 as ISO-8859-1, because they apparently have got their "half"
showing OK but they wonder where the "garbage" capital A circumflex in
front of it has come from (and generally ask here about it). I mention
this in case you accidentally end up with utf-8 output and get bitten
at the other end, as it were.
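
For anyone who wants to check the byte values for themselves, here is a
minimal sketch in Python 3. It is only an illustration and has nothing to
do with MSXML itself; the variable names are made up for the example.

```python
# U+00BD is VULGAR FRACTION ONE HALF.
half = "\u00bd"

# The same abstract character, serialised in the two encodings.
latin1_bytes = half.encode("iso-8859-1")
utf8_bytes = half.encode("utf-8")

print(list(latin1_bytes))  # [189]
print(list(utf8_bytes))    # [194, 189]

# A stand-alone byte 189 cannot be decoded as utf-8.
try:
    latin1_bytes.decode("utf-8")
    lone_byte_ok = True
except UnicodeDecodeError:
    lone_byte_ok = False
print(lone_byte_ok)  # False

# Reading the utf-8 bytes as ISO-8859-1 gives two characters:
# U+00C2 (capital A circumflex) followed by U+00BD (one half).
misread = utf8_bytes.decode("iso-8859-1")
print([hex(ord(c)) for c in misread])  # ['0xc2', '0xbd']
```

The last line is the "garbage" A circumflex case: byte 194 is a perfectly
good character in its own right in ISO-8859-1, so nothing fails and the
extra character simply shows up in front of the "half".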

Michael
---------------------------------------------------------
Michael Beddow   http://www.mbeddow.net/
XML and the Humanities page:  http://xml.lexilog.org.uk/
---------------------------------------------------------



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

