This is the mail archive of the
kawa@sources.redhat.com
mailing list for the Kawa project.
XMLParser Error
- From: lucian <lucian at mediafusion dot co dot jp>
- To: kawa at sources dot redhat dot com
- Date: Sat, 12 Apr 2003 11:20:26 +0900
- Subject: XMLParser Error
Hello!
When I call

    TreeList doc = new TreeList();
    XMLParser parser
      = new XMLParser(new LineBufferedReader(new java.io.StringReader(document)),
                      (Consumer) new NamespaceResolver(doc), messages);
    doc.beginDocument();
    parser.parse();

the following error is reported and an exception is thrown:

    <unknown>:1:3: missing name
    Exception in thread "main" gnu.text.SyntaxException

Here, document is a String that contains a simple XML document.
The problem is that it contains Japanese characters (kanji).
Something like
<name>ONE_KANJI</name>
is parsed without errors but
<ONE_KANJI>Test</ONE_KANJI>
gives the above error, as if it were invalid XML.
But the second string is valid XML and can be parsed with javax.xml.parsers.DocumentBuilder.
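
The comparison with the JDK's own parser can be sketched as follows. This is a minimal illustration, not part of the attached test case: the class name JaxpKanjiCheck and the helper rootName are hypothetical, and the kanji is written as the escape \u7530 so that the file compiles regardless of its encoding. It assumes the default JAXP parser shipped with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class JaxpKanjiCheck {
    // Parses the XML text with the default JAXP parser and returns
    // the name of the root element. Throws if the text is rejected.
    static String rootName(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document dom = builder.parse(new InputSource(new StringReader(xml)));
        return dom.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        // One kanji as element content, then the same kanji as the tag name.
        String kanjiAsValue = "<t>\u7530</t>";
        String kanjiAsTag = "<\u7530>t</\u7530>";
        System.out.println("root of first document:  " + rootName(kanjiAsValue));
        System.out.println("root of second document: " + rootName(kanjiAsTag));
    }
}
```

If both calls return without a SAXException, the JAXP parser treats the kanji as a legal element name, which is the behaviour expected from XMLParser as well.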
I'm attaching a simple test case. It was saved as EUC-JP (with Emacs on a Debian box), so compiling may need -encoding EUC-JP. For correct kanji display, kterm may be used (I think it's available on any Linux distribution).
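
Why the -encoding option matters can be shown with a small sketch. It is an illustration under stated assumptions, not part of the attached test case: the class name SourceEncodingSketch and the helper decode are hypothetical, the two bytes are the EUC-JP encoding of the single kanji U+7530, and the EUC-JP charset is only used if the running JDK provides it. The same bytes decode to one kanji under EUC-JP but to two unrelated Latin letters under ISO-8859-1, which is what a compiler sees if it reads the source file with the wrong encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SourceEncodingSketch {
    // The two bytes that EUC-JP uses for the kanji U+7530.
    static final byte[] EUC_JP_BYTES = { (byte) 0xC5, (byte) 0xC4 };

    // Decodes raw bytes with the given charset, as a compiler or
    // editor would when reading a source file.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String asLatin1 = decode(EUC_JP_BYTES, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 gives " + asLatin1.length() + " characters");

        // EUC-JP is an extended charset, so check for it before using it.
        if (Charset.isSupported("EUC-JP")) {
            String asEucJp = decode(EUC_JP_BYTES, Charset.forName("EUC-JP"));
            System.out.println("EUC-JP gives " + asEucJp.length() + " character(s)");
            System.out.println("matches U+7530: " + asEucJp.equals("\u7530"));
        }
    }
}
```

For the attached file, the corresponding compiler invocation would be javac -encoding EUC-JP Test.java, with the Kawa classes on the classpath.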
If compiling fails, I can provide a compiled class if that would help.
This is a tiny part of a project that is supposed to be used in Japan, so Japanese is a must.
I've got some good feedback from this list in the past, so I dare to ask for a bit of your time
again.
Regards.
--
Lucian
import gnu.xml.NamespaceResolver;
import gnu.lists.*;
import gnu.text.*;
import gnu.kawa.xml.*;
public class Test {
    public static void main(String[] args) throws Throwable {
        String xml1 = "<t>田</t>";
        String xml2 = "<田>t</田>";
        System.out.println("1:" + xml1);
        System.out.println("2:" + xml2);
        SourceMessages messages = new SourceMessages();
        TreeList doc = new TreeList();
        XMLParser parser
            = new XMLParser(new LineBufferedReader(new java.io.StringReader(xml1)),
                            (Consumer) new NamespaceResolver(doc), messages);
        doc.beginDocument();
        parser.parse();
        if (messages.seenErrors()) {
            System.out.println(messages.getErrors());
            throw new SyntaxException(messages);
        }
        doc.endDocument();
        // So kanji may be used as values
        System.out.println("Parsed : " + doc.toString());

        // And again for the problematic document
        doc = new TreeList();
        parser
            = new XMLParser(new LineBufferedReader(new java.io.StringReader(xml2)),
                            (Consumer) new NamespaceResolver(doc), messages);
        doc.beginDocument();
        parser.parse();
        if (messages.seenErrors()) {
            System.out.println(messages.getErrors());
            throw new SyntaxException(messages);
        }
        doc.endDocument();
        // Never reached, because kanji are rejected as tag names
        System.out.println("Parsed : " + doc.toString());
    }
}