Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unable to parse value containing special character? Using sax parser

Tags:

java

parsing

sax

I am new to parsing field. I'm trying to write a parser code but unable to get the value with respect to a particular tag that value contains ampersand(&). Please help me to get the solution.

My xml file looks like

<system>
<u_id>10145</u_id>
<serial_no>1800015</serial_no>
<branch_name>B & P Infotech Ltd.</branch_name>
</system>

and I have tried with this java code, but it's not giving me proper output.

main class

package com.satya.xmltest;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class SaxTest {

    public static void main(String[] args) {
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        SaxtestHandler handler=new SaxtestHandler();
        try {
            SAXParser parser = parserFactory.newSAXParser();
            parser.parse("C:\\Users\\abc\\Desktop\\test.xml", handler);
        } catch (Exception e) {
        }
        SystemTo systemTo=handler.systemTo;
        System.out.println("Uid :"+systemTo.getUid());
        System.out.println("serial number :"+systemTo.getSerialNumber());
        System.out.println("name :"+systemTo.getName());
    }
}

Handler class

In this class the parsing is done and setting the data values to data container class.

package com.satya.xmltest;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxtestHandler extends DefaultHandler {
    String content = "";
    SystemTo systemTo=new SystemTo();

    @Override
    public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {

        switch (qName) {
            case "system":
                System.out.println("inside company");
                break;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
        throws SAXException {
        switch (qName) {
            case "u_id":
                systemTo.setUid(content);
                break;
            case "serial_no":
                systemTo.setSerialNumber(content);
                break;
            case "branch_name":
                systemTo.setName(content);
                break;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length)
        throws SAXException {
        content = String.copyValueOf(ch, start, length).trim();
    }
}

Data container class

package com.satya.xmltest;

public class SystemTo {

    private String uid;
    private String serialNumber;
    private String name;
    public String getUid() {
        return uid;
    }
    public void setUid(String uid) {
        this.uid = uid;
    }
    public String getSerialNumber() {
        return serialNumber;
    }
    public void setSerialNumber(String serialNumber) {
        this.serialNumber = serialNumber;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
}

My output is:

Uid: 10145
serial number: 1800015
name: null

But I need:

Uid: 10145
serial number: 1800015
name: B & P Infotech Ltd.

Thanks in advance.

like image 726
Satya Avatar asked Jan 12 '23 08:01

Satya


1 Answers

There are some characters in XML that must not appear in their literal form in an XML document, except when used as markup delimiters or within a comment, a processing instruction, or a CDATA section.
List of characters and their corresponding entity or the numeric reference to replace :

Original Character    XML entity replacement      XML numeric replacement

      "                     &quot;                       &#34;   
      <                     &lt;                         &#60;   
      >                     &gt;                         &#62;
      &                     &amp;                        &#38;
      '                     &apos;                       &#39;   

you must replace above character in XML before you parse it.

You may use CDATA Section for text that is not markup constitutes the character data of the document

like image 190
Yagnesh Agola Avatar answered Jan 28 '23 02:01

Yagnesh Agola