This is the mail archive of the xsl-list@mulberrytech.com mailing list.
UTF-8 Output
- To: xsl-list at lists dot mulberrytech dot com
- Subject: [xsl] UTF-8 Output
- From: Jim Schmidt <JSchmidt at yet2 dot com>
- Date: Thu, 10 May 2001 14:09:14 -0700
- Reply-To: xsl-list at lists dot mulberrytech dot com
I have a style sheet that is used for outputting an XML document as HTML.
Some of the XML elements contain HTML, so I copy them through with the
following style sheet, which I include from the document style sheet.
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0" >
    <xsl:template match=" a | applet | b | big | body | br | caption |
        cite | code | col | colgroup | dd | div | dl | dt | em | font |
        form | frame | frameset | head | h1 | h2 | h3 | h4 | h5 | h6 | hr |
        html | i | iframe | link | li | map | meta | noframes | ol |
        p | param | pre | s | script | small | span | strong | style |
        sub | sup | td | th | tr | tt | ul | var | table ">
        <xsl:copy>
            <xsl:copy-of select="@*" />
            <xsl:apply-templates />
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
The document style sheet contains the xsl:output element shown below:
<xsl:output method="html" indent="yes" xalan:indent-amount="4"
encoding="UTF-8" />
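For context, here is a minimal sketch of how the document style sheet might be assembled. The file name `html-passthrough.xsl` is hypothetical, and the `xalan` namespace URI is an assumption (it is believed to be the URI Xalan-J 2.x uses for its output properties and should be checked against the Xalan documentation). The sketch only shows that the `xalan` prefix must be declared for `xalan:indent-amount` to be legal, and that `xsl:output` and `xsl:include` are both top-level elements.

```xml
<?xml version="1.0"?>
<!-- Sketch only: file name and xalan namespace URI are assumptions. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xalan="http://xml.apache.org/xslt">

  <!-- Serialize as HTML, encoded as UTF-8, with Xalan-specific indentation -->
  <xsl:output method="html" indent="yes" xalan:indent-amount="4"
      encoding="UTF-8"/>

  <!-- Pull in the pass-through template for embedded HTML elements -->
  <xsl:include href="html-passthrough.xsl"/>

  <!-- Templates for the rest of the document would follow here -->
</xsl:stylesheet>
```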
The XML document also defines its encoding as UTF-8.
Everything works very well except for Unicode characters. I am using Xalan 2.0.
When I look at the XSL trace, the Unicode characters are correct, but when I
look at the HTML source, some of the bytes of those characters have been
converted to HTML entities. As a result, the characters are not displayed
correctly in a browser. If I change the entities in the HTML back to the
proper characters, the page displays correctly.
What am I doing wrong? Should I be using &#nnnn; in the original XML? Or
should Xalan be able to output Unicode, and it is my template that is wrong?
I have read the FAQ and the archive regarding UTF-8 encoding but can't seem
to get it right.
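To make the `&#nnnn;` question concrete, the fragment below is a hypothetical input (the `doc` and `p` element names and the sample word are invented for illustration). It spells the same character, U+00E9, once as a literal character and once as a numeric character reference. An XML parser reports identical character data for both, so in principle the spelling chosen in the source should not change what the serializer receives.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input; element names are made up for illustration. -->
<doc>
  <!-- Literal character, stored in the file as the UTF-8 bytes C3 A9 -->
  <p>café</p>
  <!-- Same character written as a decimal numeric character reference -->
  <p>caf&#233;</p>
</doc>
```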
Jim
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list