Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get line number from xml node - java

Tags:

java

xml

I have parsed an XML file and have gotten a Node that I am interested in. How can I now find the line number in the source XML file where this node occurs?

EDIT: Currently I am using the SAXParser to parse my XML. However I will be happy with a solution using any parser.

Along with the Node, I also have the XPath expression for the node.

I need to get the line number because I am displaying the XML file in a textbox, and need to highlight the line where the node occured. Assume that the XML file is nicely formatted with sufficient line breaks.

like image 796
priomsrb Avatar asked Feb 06 '11 19:02

priomsrb


2 Answers

I have got this working by following this example:

http://eyalsch.wordpress.com/2010/11/30/xml-dom-2/

This solution follows the method suggested by Michael Kay. Here is how you use it:

// XmlTest.java

import java.io.ByteArrayInputStream;
import java.io.InputStream;

import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class XmlTest {
    public static void main(final String[] args) throws Exception {

        String xmlString = "<foo>\n"
                         + "    <bar>\n"
                         + "        <moo>Hello World!</moo>\n"
                         + "    </bar>\n"
                         + "</foo>";

        InputStream is = new ByteArrayInputStream(xmlString.getBytes());
        Document doc = PositionalXMLReader.readXML(is);
        is.close();

        Node node = doc.getElementsByTagName("moo").item(0);

        System.out.println("Line number: " + node.getUserData("lineNumber"));
    }
}

If you run this program, it will out put: "Line number: 3"

PositionalXMLReader is a slightly modified version of the example linked above.

// PositionalXMLReader.java

import java.io.IOException;
import java.io.InputStream;
import java.util.Stack;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class PositionalXMLReader {
    final static String LINE_NUMBER_KEY_NAME = "lineNumber";

    public static Document readXML(final InputStream is) throws IOException, SAXException {
        final Document doc;
        SAXParser parser;
        try {
            final SAXParserFactory factory = SAXParserFactory.newInstance();
            parser = factory.newSAXParser();
            final DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
            final DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
            doc = docBuilder.newDocument();
        } catch (final ParserConfigurationException e) {
            throw new RuntimeException("Can't create SAX parser / DOM builder.", e);
        }

        final Stack<Element> elementStack = new Stack<Element>();
        final StringBuilder textBuffer = new StringBuilder();
        final DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(final Locator locator) {
                this.locator = locator; // Save the locator, so that it can be used later for line tracking when traversing nodes.
            }

            @Override
            public void startElement(final String uri, final String localName, final String qName, final Attributes attributes)
                    throws SAXException {
                addTextIfNeeded();
                final Element el = doc.createElement(qName);
                for (int i = 0; i < attributes.getLength(); i++) {
                    el.setAttribute(attributes.getQName(i), attributes.getValue(i));
                }
                el.setUserData(LINE_NUMBER_KEY_NAME, String.valueOf(this.locator.getLineNumber()), null);
                elementStack.push(el);
            }

            @Override
            public void endElement(final String uri, final String localName, final String qName) {
                addTextIfNeeded();
                final Element closedEl = elementStack.pop();
                if (elementStack.isEmpty()) { // Is this the root element?
                    doc.appendChild(closedEl);
                } else {
                    final Element parentEl = elementStack.peek();
                    parentEl.appendChild(closedEl);
                }
            }

            @Override
            public void characters(final char ch[], final int start, final int length) throws SAXException {
                textBuffer.append(ch, start, length);
            }

            // Outputs text accumulated under the current node
            private void addTextIfNeeded() {
                if (textBuffer.length() > 0) {
                    final Element el = elementStack.peek();
                    final Node textNode = doc.createTextNode(textBuffer.toString());
                    el.appendChild(textNode);
                    textBuffer.delete(0, textBuffer.length());
                }
            }
        };
        parser.parse(is, handler);

        return doc;
    }
}
like image 177
priomsrb Avatar answered Nov 15 '22 21:11

priomsrb


If you are using a SAX parser then the line number of an event can be obtained using the Locator object, which is notified to the ContentHandler via the setDocumentLocator() callback. This is called at the start of parsing, and you need to save the Locator; then after any event (such as startElement()), you can call methods such as getLineNumber() to obtain the current position in the source file. (After startElement(), the callback is defined to give you the line number on which the ">" of the start tag appears.)

like image 20
Michael Kay Avatar answered Nov 15 '22 19:11

Michael Kay