Best Java XML parser to manipulate/edit an existing XML document

Tags:

java

parsing

xml

TASK: I have an existing XML document (UTF-8) that uses XML namespaces and XML Schema. I need to navigate to a particular element, append content (which also needs to use XML namespace prefixes) to that element, and then write the document out again.

Which is the best XML parser library to use for this task?

I've seen a previous thread (Best XML parser for Java), but I wasn't sure whether dom4j or JDOM handles namespaces and XML Schema well, or has good support for UTF-8 characters.

Some parsers that seem suited to the task:
JDOM
dom4j
XOM
Woodstox

Any idea which one is the best? :-) I use JDK 6 and would prefer NOT to use the built-in SAX/DOM facilities for this job, because that requires me to write too much code.

Some examples of doing such a task would help.

anjanb asked Mar 26 '10 13:03




2 Answers

Use XSLT. Seriously. This is a perfect job for it. Just use a copy template to copy everything as is, except for the place where you need to add more XML. You can even add the new content by writing actual XML instead of doing DOM manipulation.

This is the copy template:

<xsl:template match="node() | @*">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
</xsl:template>

I know a lot of people hate XSLT, but this is a task where it would really shine and take almost no code. Also, you could just use what's in the JDK.
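For illustration, here is a minimal sketch of how the copy template might be combined with one extra template and run through the JDK's javax.xml.transform API. The namespace http://example.com/ns, the ex prefix, and the element names root, items, and item are all hypothetical, chosen only for this example; substitute the names from your own schema.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AppendWithXslt {

    // Hypothetical namespace and element names, used only for this sketch.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + "    xmlns:ex='http://example.com/ns'>"
      // Copy template: copies every node and attribute unchanged.
      + "  <xsl:template match='node() | @*'>"
      + "    <xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
      + "  </xsl:template>"
      // More specific template: copy the target element, then append to it.
      + "  <xsl:template match='ex:items'>"
      + "    <xsl:copy>"
      + "      <xsl:apply-templates select='@* | node()'/>"
      + "      <ex:item>new</ex:item>"
      + "    </xsl:copy>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML text and returns the result.
    static String transform(String inputXml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(inputXml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String input = "<ex:root xmlns:ex='http://example.com/ns'>"
                     + "<ex:items><ex:item>old</ex:item></ex:items></ex:root>";
        System.out.println(transform(input));
    }
}
```

The second template wins over the copy template for ex:items because its match pattern is more specific, so only that element gets the extra child. For real files you would probably pass StreamSource and StreamResult objects built from files or streams rather than strings.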

Russell Leggett answered Oct 14 '22 07:10


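Using the JDK's built-in DOM parser (JAXP DocumentBuilder, which is the standard W3C DOM API rather than JDOM), taking an InputStream and making it a Document: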

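// httpURLConnection and baseUrl are assumed to be set up by the caller.
InputStream inputStream = httpURLConnection.getInputStream();
DocumentBuilderFactory docbf = DocumentBuilderFactory.newInstance();
// Namespace awareness is off by default; the prefixed XPath used later needs it.
docbf.setNamespaceAware(true);
DocumentBuilder docbuilder = docbf.newDocumentBuilder();
// The second argument is the system ID used to resolve relative URIs.
Document document = docbuilder.parse(inputStream, baseUrl);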

At that point, you have the XML in a Java object. Done. Easy.

You can either use the document object and the Java API to just walk through it, or also use XPath, which I find easier (once I learned it).

Build an XPath object, which takes a bit:

public static XPath buildXPath() {
    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    xpath.setNamespaceContext(new AtomNamespaceContext());
    return xpath;
}


public class AtomNamespaceContext implements NamespaceContext {

    public String getNamespaceURI(String prefix) {
        if (prefix == null)
            // The NamespaceContext contract specifies IllegalArgumentException here.
            throw new IllegalArgumentException("Null prefix");
        else if ("a".equals(prefix))
            return "http://www.w3.org/2005/Atom";
        else if ("app".equals(prefix))
            return "http://www.w3.org/2007/app";
        else if ("os".equals(prefix))
            return "http://a9.com/-/spec/opensearch/1.1/";
        else if ("x".equals(prefix)) 
            return "http://www.w3.org/1999/xhtml";
        else if ("xml".equals(prefix))
            return XMLConstants.XML_NS_URI;
        return XMLConstants.NULL_NS_URI;
    }

    // This method isn't necessary for XPath processing.
    public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
    }

    // This method isn't necessary for XPath processing either.
    public Iterator getPrefixes(String uri) {
        throw new UnsupportedOperationException();
    }
}

Then just use it, which (thankfully) doesn't take much time at all:

return Integer.parseInt(xpath.evaluate("/a:feed/os:totalResults/text()", document));
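The snippets above only read from the document. For the append-and-write-out part of the question, here is a rough sketch using the same JDK DOM API, offered as one possible approach rather than a definitive one. The class and method names (AppendWithDom, appendEntry) are made up for this example, and the Atom entry/title structure is just an assumed illustration of namespace-prefixed content.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class AppendWithDom {

    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Parses the XML, appends <a:entry><a:title>...</a:title></a:entry> to the
    // first a:feed element, and returns the serialized document.
    static String appendEntry(String xml, String title) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));

        // Locate the target element by namespace URI and local name.
        Element feed = (Element) doc.getElementsByTagNameNS(ATOM_NS, "feed").item(0);

        // createElementNS takes the namespace URI plus the prefixed (qualified) name.
        Element entry = doc.createElementNS(ATOM_NS, "a:entry");
        Element titleElement = doc.createElementNS(ATOM_NS, "a:title");
        titleElement.setTextContent(title);
        entry.appendChild(titleElement);
        feed.appendChild(entry);

        // A Transformer created without a stylesheet performs an identity copy,
        // which serves here as the serializer.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String input = "<a:feed xmlns:a='http://www.w3.org/2005/Atom'>"
                     + "<a:title>Example</a:title></a:feed>";
        System.out.println(appendEntry(input, "hello"));
    }
}
```

When writing to a file, it is probably safer to wrap an OutputStream in the StreamResult instead of a Writer, so that the declared UTF-8 encoding is the one actually used for the bytes on disk.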
Dean J answered Oct 14 '22 07:10