
SAX Parser characters method doesn't collect all content

Tags:

java

xml

sax

I'm using a SAX parser to parse XML, and it is working fine.

I have the tag below in my XML.

<value>•CERTASS >> Certass</value>

Here I expect '•CERTASS >> Certass' as output, but the code below returns only 'Certass'. Is there an issue with the special characters in the value tag?

public void characters(char[] buffer, int start, int length) {
    temp = new String(buffer, start, length);
}
user755806 asked Jul 22 '15 16:07



1 Answer

The characters() method is not guaranteed to be called only once per element. A SAX parser may deliver an element's text in several chunks.

If you store the content in a String and characters() happens to run twice, each call overwrites the temp variable, so you end up with only the content from the last call.
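The overwrite can be reproduced without a parser by calling the callback by hand. The sketch below is only an illustration: the class name OverwriteDemo, the helper feedInTwoChunks, and the split position are hypothetical, and a real parser decides for itself where (and whether) to split the text.

```java
import org.xml.sax.helpers.DefaultHandler;

public class OverwriteDemo {

    // Mirrors the handler from the question: every call replaces temp.
    static class OverwritingHandler extends DefaultHandler {
        String temp = "";

        @Override
        public void characters(char[] buffer, int start, int length) {
            temp = new String(buffer, start, length);
        }
    }

    // Simulates a parser that reports the text in two chunks,
    // split at splitIndex (an assumed position, chosen for illustration).
    static String feedInTwoChunks(String content, int splitIndex) {
        OverwritingHandler handler = new OverwritingHandler();
        char[] buffer = content.toCharArray();
        handler.characters(buffer, 0, splitIndex);
        handler.characters(buffer, splitIndex, buffer.length - splitIndex);
        return handler.temp;
    }

    public static void main(String[] args) {
        // \u2022 is the bullet character from the question.
        String content = "\u2022CERTASS >> Certass";
        System.out.println(feedInTwoChunks(content, 12)); // prints "Certass"
    }
}
```

Only the second chunk survives, which matches the symptom described in the question.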

To remedy this, use a StringBuilder: append() the content in characters(), then process the accumulated text in endElement(). For example:

 DefaultHandler handler = new DefaultHandler() {
     // Accumulates the text of the current element across characters() calls.
     private StringBuilder stringBuilder;

     @Override
     public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
         // Start with an empty buffer for each new element.
         stringBuilder = new StringBuilder();
     }

     @Override
     public void characters(char[] buffer, int start, int length) {
         // Append rather than assign, so earlier chunks are kept.
         stringBuilder.append(buffer, start, length);
     }

     @Override
     public void endElement(String uri, String localName, String qName) throws SAXException {
         // All chunks have arrived by now, so the text is complete.
         System.out.println(stringBuilder.toString());
     }
 };

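Parsing the String "<value>•CERTASS >> Certass</value>" with the handler above gives the following output (the leading ? most likely stands in for the • character, which my console could not display):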

?CERTASS >> Certass
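For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The class name SaxTextCollector and the helper collectText are made up for this example, and it assumes the XML arrives as a String with a single text-bearing element, as in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTextCollector {

    // Parses the XML and returns the text of the most recently started element.
    // That is enough for single-element input like the one in the question.
    static String collectText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                text.setLength(0); // reset the buffer for each new element
            }

            @Override
            public void characters(char[] buffer, int start, int length) {
                text.append(buffer, start, length); // keep every chunk
            }
        };

        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // \u2022 is the bullet character from the question.
        String xml = "<value>\u2022CERTASS >> Certass</value>";
        System.out.println(collectText(xml));
    }
}
```

Reading from a StringReader means the parser receives already-decoded characters, so the bullet survives parsing. Whether it displays correctly afterwards depends on the console's encoding.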

I hope this helps.

Rudi Kershaw answered Oct 05 '22 22:10
