
Get a node's inner XML as String in Java DOM

Tags: java, dom, xml

I have an XML org.w3c.dom.Node that looks like this:

<variable name="variableName">
    <br /><strong>foo</strong> bar
</variable>

How do I get the <br /><strong>foo</strong> bar part as a String?

Marjan asked Jul 21 '10 15:07


7 Answers

Same problem. To solve it I wrote this helper function:

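// Uses org.w3c.dom.Node, org.w3c.dom.NodeList,
// org.w3c.dom.ls.DOMImplementationLS and org.w3c.dom.ls.LSSerializer
public String innerXml(Node node) {
    DOMImplementationLS lsImpl = (DOMImplementationLS) node.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
    LSSerializer lsSerializer = lsImpl.createLSSerializer();
    // Without this, each serialized element may be prefixed with an <?xml ...?> declaration
    lsSerializer.getDomConfig().setParameter("xml-declaration", false);
    NodeList childNodes = node.getChildNodes();
    StringBuilder sb = new StringBuilder();
    // Serialize each child in document order and concatenate the results
    for (int i = 0; i < childNodes.getLength(); i++) {
        sb.append(lsSerializer.writeToString(childNodes.item(i)));
    }
    return sb.toString();
}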
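
For anyone who wants to try this approach end to end, here is a minimal, self-contained sketch. It repeats the helper with the XML declaration switched off and feeds it the XML from the question. The class name InnerXmlDemo and the parseRoot helper are made up for this illustration, and the exact whitespace and empty-element form of the output may vary between DOM implementations.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

class InnerXmlDemo {

    // Serializes every child of the given node with DOM Level 3 Load and Save.
    static String innerXml(Node node) {
        DOMImplementationLS lsImpl = (DOMImplementationLS)
                node.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = lsImpl.createLSSerializer();
        // Suppress the <?xml ...?> declaration in the serialized fragments
        serializer.getDomConfig().setParameter("xml-declaration", false);
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            sb.append(serializer.writeToString(children.item(i)));
        }
        return sb.toString();
    }

    // Hypothetical helper for this demo: parses a string and returns its root element.
    static Node parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            // Wrapped so that callers in this demo need no checked-exception handling
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Node variable = parseRoot(
                "<variable name=\"variableName\">\n"
                + "    <br /><strong>foo</strong> bar\n"
                + "</variable>");
        // Prints the children of <variable>, including the surrounding whitespace text
        System.out.println(innerXml(variable));
    }
}
```

The whitespace around the children is part of the result because it lives in text nodes of its own; call trim() on the returned string if that is not wanted.
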
Andrey M. answered Sep 28 '22 08:09


There is no simple method on org.w3c.dom.Node for this. getTextContent() returns the text of all descendant nodes concatenated together, with the markup dropped. getNodeValue() gives you the text of the current node if it is, for example, an Attribute, CDATA or Text node; for an element it returns null. So you would need to serialize the node yourself, using a combination of getChildNodes(), getNodeName() and getNodeValue() to build the string.
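
A rough sketch of what such a hand-rolled serializer could look like is below. It is an illustration under simplifying assumptions rather than a complete XML writer: it handles elements, attributes, text and CDATA only, silently skips comments and processing instructions, and ignores namespaces. The class name ManualInnerXml and the parseRoot helper are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ManualInnerXml {

    // Concatenates the markup of all children of the given node.
    static String innerXml(Node node) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            write(children.item(i), sb);
        }
        return sb.toString();
    }

    // Writes one node; elements recurse into their own children.
    private static void write(Node node, StringBuilder sb) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                sb.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    sb.append(' ').append(attr.getNodeName())
                      .append("=\"").append(escape(attr.getNodeValue())).append('"');
                }
                if (node.hasChildNodes()) {
                    sb.append('>').append(innerXml(node))
                      .append("</").append(node.getNodeName()).append('>');
                } else {
                    // Empty elements are written in the short form
                    sb.append("/>");
                }
                break;
            }
            case Node.TEXT_NODE:
                sb.append(escape(node.getNodeValue()));
                break;
            case Node.CDATA_SECTION_NODE:
                sb.append("<![CDATA[").append(node.getNodeValue()).append("]]>");
                break;
            default:
                // Comments, processing instructions etc. are skipped in this sketch
                break;
        }
    }

    // Escapes the characters that would otherwise break the markup.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Hypothetical helper for this demo: parses a string and returns its root element.
    static Node parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Because the original spelling of an empty element is not kept in the DOM, <br /> comes back as <br/>; the two forms are equivalent XML.
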

You can also do it with one of the various XML serialization libraries that exist. There is XStream or even JAXB. This is discussed here: XML serialization in Java?

Robert Diana answered Sep 28 '22 07:09


If you don't want to resort to external libraries, the following solution might come in handy. If you have a node <parent><child name="Nina"/></parent> and you want to extract the children of the parent element, proceed as follows:

    StringBuilder resultBuilder = new StringBuilder();
    // Get all children of the given parent node
    NodeList children = parent.getChildNodes();
    try {
        // Set up the output transformer
        TransformerFactory transfac = TransformerFactory.newInstance();
        Transformer trans = transfac.newTransformer();
        trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Indentation may add whitespace that is not in the source;
        // drop this line if you need the content unchanged
        trans.setOutputProperty(OutputKeys.INDENT, "yes");

        for (int index = 0; index < children.getLength(); index++) {
            Node child = children.item(index);

            // Use a fresh writer per child: a writer shared across iterations
            // keeps the earlier output and would duplicate it in the result
            StringWriter stringWriter = new StringWriter();
            trans.transform(new DOMSource(child), new StreamResult(stringWriter));
            // Append child to end result
            resultBuilder.append(stringWriter.toString());
        }
    } catch (TransformerException e) {
        // Error handling goes here
    }
    return resultBuilder.toString();
AgentKnopf answered Sep 28 '22 08:09


If you're using jOOX, you can wrap your node in a jQuery-like syntax and just call toString() on it:

$(node).toString();

It uses an identity-transformer internally, like this:

ByteArrayOutputStream out = new ByteArrayOutputStream();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
Source source = new DOMSource(element);
Result target = new StreamResult(out);
transformer.transform(source, target);
return out.toString();
Lukas Eder answered Sep 28 '22 08:09


Extending Andrey M.'s answer, I had to modify the code slightly to get the complete DOM document. If you just use

 NodeList childNodes = node.getChildNodes();

the root element is not included. To include the root element (and get the complete .xml document) I used:

 public String innerXml(Node node) {
     DOMImplementationLS lsImpl = (DOMImplementationLS)node.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
     LSSerializer lsSerializer = lsImpl.createLSSerializer();
     lsSerializer.getDomConfig().setParameter("xml-declaration", false);
     StringBuilder sb = new StringBuilder();
     sb.append(lsSerializer.writeToString(node));
     return sb.toString(); 
 }
Alan answered Sep 28 '22 08:09


With the last answer I had the problem that the method 'nodeToStream()' is undefined, so here is my version:

public static String toString(Node node) {
    String xmlString = "";
    try {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        //transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        Source source = new DOMSource(node);

        StringWriter sw = new StringWriter();
        StreamResult result = new StreamResult(sw);

        transformer.transform(source, result);
        xmlString = sw.toString();

    } catch (Exception ex) {
        ex.printStackTrace();
    }

    return xmlString;
}
MatEngel answered Sep 28 '22 06:09


I want to extend the very good answer from Andrey M.:

It can happen that a node is not serializable, which results in the following exception on some implementations:

org.w3c.dom.ls.LSException: unable-to-serialize-node: 
            unable-to-serialize-node: The node could not be serialized.

I had this issue with the implementation "org.apache.xml.serialize.DOMSerializerImpl.writeToString(DOMSerializerImpl)" running on Wildfly 13.

To solve this issue, I would suggest changing the code example from Andrey M. a little:

private static String innerXml(Node node) {
    DOMImplementationLS lsImpl = (DOMImplementationLS) node.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
    LSSerializer lsSerializer = lsImpl.createLSSerializer();
    lsSerializer.getDomConfig().setParameter("xml-declaration", false);
    NodeList childNodes = node.getChildNodes();
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < childNodes.getLength(); i++) {
        Node innerNode = childNodes.item(i);
        if (innerNode.getNodeType() == Node.ELEMENT_NODE) {
            // Elements go through the serializer, including empty ones such as <br />,
            // whose getNodeValue() is null and would otherwise be appended as "null"
            sb.append(lsSerializer.writeToString(innerNode));
        } else if (innerNode.getNodeValue() != null) {
            // Text, CDATA and similar nodes: append the raw value (not XML-escaped)
            sb.append(innerNode.getNodeValue());
        }
    }
    return sb.toString();
}

I also added the comment from Nyerguds. This works for me in Wildfly 13.

Ralph answered Sep 28 '22 07:09