This is the mail archive of the kawa@sources.redhat.com mailing list for the Kawa project.



XMLParser Error


Hello!
 When calling

    TreeList doc = new TreeList();
    XMLParser parser
        = new XMLParser(new LineBufferedReader(new java.io.StringReader(document)),
                        (Consumer)new NamespaceResolver(doc), messages);
    doc.beginDocument();
    parser.parse();
 
 <unknown>:1:3: missing name
 Exception in thread "main" gnu.text.SyntaxException
 
 is thrown.

 document is a String that contains a simple XML document.
 The problem is that it contains Japanese characters (kanji).
 Something like
  
   <name>ONE_KANJI</name>  
  
 is parsed without errors, but

   <ONE_KANJI>Test</ONE_KANJI>

 gives the above error, as if it were invalid XML.
 But the second string is valid XML and can be parsed with javax.xml.parsers.DocumentBuilder.
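
 For comparison, here is a minimal sketch of the same check done with the JDK's own DOM parser. The class name DomCheck and the helper rootName are made up for illustration, and \u7530 is only an assumed stand-in for ONE_KANJI; any single kanji should behave the same way.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomCheck {
    // Parses the XML text with the JDK's DOM parser and returns the
    // name of the root element. Throws if the text is not well-formed.
    static String rootName(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document dom = builder.parse(new InputSource(new StringReader(xml)));
        return dom.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        // \u7530 is an assumed placeholder kanji standing in for ONE_KANJI.
        String xml2 = "<\u7530>Test</\u7530>";
        String root = rootName(xml2);
        // Report whether the kanji survived as the element name.
        System.out.println("root is the kanji: " + root.equals("\u7530"));
    }
}
```

 Reading from a StringReader keeps byte encodings out of the picture, so this only tests whether a kanji is accepted as an element name.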

 I'm attaching a simple test case. It was saved as EUC-JP (with Emacs on a Debian box), so compiling may need -encoding EUC-JP. For correct kanji display, kterm may be used (I think it's available on any Linux distribution).
 If compiling fails, I can provide a compiled class if that would help.
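
 If the source encoding gets in the way, one option is to write the kanji as Java Unicode escapes, so the file is plain ASCII and compiles without any -encoding flag. This is only a sketch: the class name EscapedLiterals is hypothetical, and U+7530 is an assumed example character, not necessarily the one in the attachment.

```java
public class EscapedLiterals {
    // Kanji used as element content; \u7530 is an assumed example character.
    static final String XML1 = "<t>\u7530</t>";
    // The same kanji used as the element name.
    static final String XML2 = "<\u7530>t</\u7530>";

    public static void main(String[] args) {
        // Print the code point of the element name to confirm that the
        // escape was compiled into a single character.
        System.out.println("name code point: U+"
            + Integer.toHexString(XML2.codePointAt(1)).toUpperCase());
        System.out.println("1:" + XML1);
        System.out.println("2:" + XML2);
    }
}
```

 Printing the kanji correctly still depends on the terminal and on the JVM's default output encoding; the escapes only remove the compile-time dependency.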
 This is a small part of a project that is meant to be used in Japan, so Japanese support is a must.
 I've had good feedback from this list in the past, so I dare to ask for a bit of your time again.
 Regards.
-- 
Lucian
import gnu.xml.NamespaceResolver;
import gnu.lists.*;
import gnu.text.*;
import gnu.kawa.xml.*;
public class Test {
    public static void main(String[] args) throws Throwable {
	String xml1 = "<t>田</t>";
	String xml2 = "<田>t</田>";
	System.out.println("1:"+xml1 );
	System.out.println("2:"+xml2 );
	SourceMessages messages = new SourceMessages();
	TreeList doc = new TreeList();
	XMLParser parser
	    = new XMLParser(new LineBufferedReader(new java.io.StringReader(xml1)),
                        (Consumer)new NamespaceResolver(doc), messages);
	doc.beginDocument();
	parser.parse();
	if (messages.seenErrors()) {
	    System.out.println(messages.getErrors());
	     throw new SyntaxException(messages);
	}
	doc.endDocument();
	// So kanji may be used as text content.

	System.out.println("Parsed : " + doc.toString());

	// And again for the problematic document.
	doc = new TreeList();
	parser
        = new XMLParser(new LineBufferedReader(new java.io.StringReader(xml2)),
                        (Consumer)new NamespaceResolver(doc), messages);
	doc.beginDocument();
	parser.parse();
	if (messages.seenErrors()) {
	    System.out.println(messages.getErrors());
	    throw new SyntaxException(messages);
	}
	doc.endDocument();
	// Never reached, because the parser rejects kanji in tag names.
	System.out.println("Parsed : " + doc.toString());
    }
}
