This is the mail archive of the xsl-list@mulberrytech.com mailing list .



RE: Special entity characters in Shift-JIS XSL.


Is there a way in which I can specify UTF-8 encoding and output an ASCII
sequence? I should be able to see the file in any text editor, so can I code
all the characters as &#nnnn;?

Kiran

-----Original Message-----
From: David Carlisle [mailto:davidc@nag.co.uk]
Sent: Friday, December 17, 1999 2:05 AM
To: xsl-list@mulberrytech.com
Subject: Re: Special entity characters in Shift-JIS XSL.




> I think the OPPOSITE of flaky is the word I would use to describe an
> entity identification paradigm that allows the entity to remain in its
> encoded form, yet still be identified as an entity.  I think solid is
> more the word.

You could build a solid system on that basis, but it wouldn't be XML.

> how can it then be passed to anymore parsers expecting 7-bit ASCII
> characters?  

The XML character set is _always_ Unicode. If the encoding isn't the default
UTF-8 or UTF-16, not all of the character set may be directly accessible as
character data, but you can always use the &# syntax to access any
Unicode character. An XML parser _has_ to treat `A' and `&#65;' in an
identical manner and report `character number 65' to the application,
whichever version was in the input file. If your application _needs_
to see `&#65;' and not `A' then it isn't an XML application (it could be
an SGML one).
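
As a rough illustration of both points (not part of the original message),
here is a minimal Python sketch using the standard library's
xml.etree.ElementTree. The element name `doc' and the sample katakana text
are made up for the example. It checks that a literal `A' and the reference
`&#65;' reach the application as the same character, and that serializing
with an ASCII encoding writes non-ASCII characters as &#nnnn; references,
which any text editor can display.

```python
import xml.etree.ElementTree as ET

# Two documents that differ only in how character number 65 is written.
literal = ET.fromstring("<doc>A</doc>")
reference = ET.fromstring("<doc>&#65;</doc>")

# The parser hands the application the same character data either way.
same_text = literal.text == reference.text
code_point = ord(reference.text)

# Hypothetical sample content: three katakana characters (U+30C6 U+30B9 U+30C8).
kana = ET.fromstring("<doc>\u30c6\u30b9\u30c8</doc>")

# Serializing to an ASCII encoding forces every character outside ASCII
# to be written as a numeric character reference.
ascii_bytes = ET.tostring(kana, encoding="us-ascii")

# Parsing the ASCII-only bytes gives back the original characters.
round_trip = ET.fromstring(ascii_bytes).text
```

The output is still an ordinary XML document; only the byte-level
spelling of the characters has changed, so a conforming parser downstream
should see the same text.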

> What if each of those parsers followed the spec, the first
> transforming the character into a 2-byte unicode character, leaving the
> others to see the two bytes as simply two different characters in the
> stream?

This can't happen, as in a well-formed XML document you _always_ know
if a multi-byte encoding is being used. Either the <?xml declaration
specifies a single-byte encoding such as Latin-1, or a multi-byte
encoding is being used (UTF-8 unless the first two bytes of the file are
the BOM, in which case it's UTF-16).
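
The decision described above can be sketched in Python roughly as follows.
The function name guess_xml_encoding is hypothetical, and this is a
simplification: it only covers the cases mentioned here (a UTF-16 BOM, an
encoding named in the <?xml declaration, otherwise UTF-8) and ignores the
other cases a real parser would also handle, such as UTF-32 or EBCDIC.

```python
import codecs
import re

# Matches an ASCII-compatible XML declaration that names an encoding.
_DECL = re.compile(
    rb"<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)

def guess_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document from its first bytes."""
    # A UTF-16 byte order mark, if present, is the first two bytes.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Otherwise trust the encoding named in the declaration, if any.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # No BOM and no declared encoding: the default is UTF-8.
    return "utf-8"
```

For example, a file that starts with
<?xml version="1.0" encoding="Shift_JIS"?> would be reported as
shift_jis, while a file with neither a BOM nor a declaration falls
through to utf-8.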

David
 


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

