This is the mail archive of the xsl-list@mulberrytech.com mailing list.



RE: SAXON and UTF-8


> Newbie observations: I get the following error when feeding
> SAXON an XML document with UTF-8 encoding.
>
> --
> E:\test\sampledocs>saxon dataseq.xml sampledoc.xsl > dataseq.fo
> Fatal error reported by XML parser: required character (found
> "?") (expected
> "<"
> )
>   URL:    file:/E:/test/sampledocs/dataseq.xml
>   Line:   1
>   Column: 5
> Error
>   required character (found "?") (expected "<")

This message suggests that there's no problem with your UTF-8, but there is
a problem with your XML. Without seeing the file, I can't tell you what the
problem is.

> Saving in "plain text" triggers the appropriate error message
> from SAXON:
>
> E:\test\sampledocs>saxon dataseq.xml sampledoc.xsl > dataseq.fo
> Fatal error reported by XML parser: bad continuation of
> multi-byte UTF-8 sequence (character code: 0x72)
>   URL:    file:/E:/test/sampledocs/dataseq.xml
>   Line:   -1
>   Column: 1477
> Error
>   bad continuation of multi-byte UTF-8 sequence (character code: 0x72)
> Transformation failed
>
> That error message could have been better.

Yes. AElfred first tries to decode a buffer-full of bytes into characters,
and then looks for the newline characters that determine line endings. If it
fails in the first step, then the line number is -1 and the column number is
the byte offset where it hit trouble. In general, if the file is not in the
expected encoding then line boundaries will not be detected correctly.
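
The two-step behaviour described above can be illustrated with a minimal
sketch. This is not AElfred's code (AElfred is written in Java and works
buffer by buffer); the function name locate and its arguments are
hypothetical, and the exact byte offset a real parser reports may differ
from the one Python reports. It only shows why a decoding failure leaves
no usable line number.

```python
def locate(data: bytes, char_index: int) -> tuple[int, int]:
    """Return (line, column) for a character position in UTF-8 input.

    Step 1 decodes the bytes into characters. Step 2 counts newlines.
    If step 1 fails, no line information exists yet, so the result is
    (-1, byte offset of the undecodable sequence).
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as err:
        # Decoding failed before any line endings were found.
        return (-1, err.start)

    # Line number: newlines before the position, plus one.
    line = text.count("\n", 0, char_index) + 1
    # Column: 1-based distance from the preceding newline
    # (rfind returns -1 when the position is on the first line).
    column = char_index - text.rfind("\n", 0, char_index)
    return (line, column)


# A file saved in a single-byte encoding but read as UTF-8:
latin1_bytes = "line one\nna\u00efve text\n".encode("latin-1")
print(locate(latin1_bytes, 0))

# The same text saved as real UTF-8 decodes and yields a normal position:
utf8_bytes = "line one\nna\u00efve text\n".encode("utf-8")
print(locate(utf8_bytes, 11))
```

In the first call the byte for the accented letter is followed by an
ordinary ASCII byte, which is not a valid UTF-8 continuation byte, so the
decoder stops there and only a byte offset is available.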

Mike Kay


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

