
Getting XML Node text value with Java DOM

Tags: java, dom, xml

I can't fetch the text value of a node with Node.getNodeValue(), Node.getFirstChild().getNodeValue(), or Node.getTextContent().

My XML looks like this:

    <add job="351">
        <tag>foobar</tag>
        <tag>foobar2</tag>
    </add>

I'm trying to get the text value of each tag element (fetching non-text elements works fine). My Java code looks like this:

    Document doc = db.parse(new File(args[0]));
    Node n = doc.getFirstChild();
    NodeList nl = n.getChildNodes();
    Node an, an2;

    for (int i = 0; i < nl.getLength(); i++) {
        an = nl.item(i);

        if (an.getNodeType() == Node.ELEMENT_NODE) {
            NodeList nl2 = an.getChildNodes();

            for (int i2 = 0; i2 < nl2.getLength(); i2++) {
                an2 = nl2.item(i2);

                // DEBUG PRINTS
                System.out.println(an2.getNodeName() + ": type (" + an2.getNodeType() + "):");

                if (an2.hasChildNodes())
                    System.out.println(an2.getFirstChild().getTextContent());

                if (an2.hasChildNodes())
                    System.out.println(an2.getFirstChild().getNodeValue());

                System.out.println(an2.getTextContent());
                System.out.println(an2.getNodeValue());
            }
        }
    }

It prints out

    tag type (1):
    tag1
    tag1
    tag1
    null
    #text type (3):
    _blank line_
    _blank line_
    ...

Thanks for the help.

asked Apr 21 '09 14:04 by Emilio



2 Answers

I'd print out the result of an2.getNodeName() as well for debugging purposes. My guess is that your tree-crawling code isn't reaching the nodes you think it is. The lack of any node-name checks in your code reinforces that suspicion.

Other than that, the Javadoc for Node specifies that getNodeValue() returns null for nodes of type Element, so you really should be using getTextContent(). I'm not sure why that wouldn't give you the text you want.

Perhaps iterate the children of your tag node and see what types are there?
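As a rough illustration of that check, the sketch below lists the direct children of the root element and shows what getNodeValue() and getTextContent() return for each. The class name NodeTypeDemo and the helper describeChildren are hypothetical names introduced here, and the XML is the sample from the question; this is one way to inspect node types, not the asker's actual code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeTypeDemo {

    // Hypothetical helper: describes each direct child of the root element.
    public static List<String> describeChildren(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        NodeList children = root.getChildNodes();

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                // For elements, getNodeValue() is null; getTextContent() holds the text.
                lines.add(child.getNodeName() + " (element): nodeValue=" + child.getNodeValue()
                        + ", textContent=" + child.getTextContent());
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                // Indentation between elements shows up as whitespace-only text nodes.
                lines.add("#text: whitespaceOnly=" + child.getNodeValue().trim().isEmpty());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<add job=\"351\">\n"
                + "    <tag>foobar</tag>\n"
                + "    <tag>foobar2</tag>\n"
                + "</add>";
        describeChildren(xml).forEach(System.out::println);
    }
}
```

With a default DocumentBuilderFactory, the whitespace between the tag elements is kept as text nodes, which would explain blank lines appearing in debug output for nodes of type 3.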

I tried this code and it works for me:

    String xml = "<add job=\"351\">\n" +
                 "    <tag>foobar</tag>\n" +
                 "    <tag>foobar2</tag>\n" +
                 "</add>";
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    ByteArrayInputStream bis = new ByteArrayInputStream(xml.getBytes());
    Document doc = db.parse(bis);
    Node n = doc.getFirstChild();
    NodeList nl = n.getChildNodes();
    Node an, an2;

    for (int i = 0; i < nl.getLength(); i++) {
        an = nl.item(i);
        if (an.getNodeType() == Node.ELEMENT_NODE) {
            NodeList nl2 = an.getChildNodes();

            for (int i2 = 0; i2 < nl2.getLength(); i2++) {
                an2 = nl2.item(i2);
                // DEBUG PRINTS
                System.out.println(an2.getNodeName() + ": type (" + an2.getNodeType() + "):");
                if (an2.hasChildNodes()) System.out.println(an2.getFirstChild().getTextContent());
                if (an2.hasChildNodes()) System.out.println(an2.getFirstChild().getNodeValue());
                System.out.println(an2.getTextContent());
                System.out.println(an2.getNodeValue());
            }
        }
    }

Output was:

    #text: type (3):
    foobar
    foobar
    #text: type (3):
    foobar2
    foobar2

answered Sep 28 '22 23:09 by jsight


If your XML is deeply nested, you might want to consider using XPath, which ships with the JRE and lets you access the contents far more easily:

    String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()",
            document.getDocumentElement());

Full example:

    import static org.junit.Assert.assertEquals;

    import java.io.StringReader;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;

    import org.junit.Before;
    import org.junit.Test;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    public class XPathTest {

        private Document document;

        @Before
        public void setup() throws Exception {
            String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            document = db.parse(new InputSource(new StringReader(xml)));
        }

        @Test
        public void testXPath() throws Exception {
            XPathFactory xpf = XPathFactory.newInstance();
            XPath xp = xpf.newXPath();
            String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()",
                    document.getDocumentElement());
            assertEquals("foobar", text);
        }
    }
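The example above fetches only the first tag. As a possible extension (not part of the original answer), the same XPath API can return every matching element as a node set, so that all tag values are read in one pass. The class name XPathAllTags and the helper tagValues are hypothetical names used only for this sketch, which assumes the sample XML from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathAllTags {

    // Hypothetical helper: returns the text of every <tag> under <add job="351">.
    public static List<String> tagValues(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xp = XPathFactory.newInstance().newXPath();
        // Requesting NODESET yields a NodeList of the matching elements.
        NodeList nodes = (NodeList) xp.evaluate(
                "/add[@job='351']/tag", document, XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // getTextContent() on an element returns the text it contains.
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
        System.out.println(tagValues(xml));
    }
}
```

Selecting the elements and calling getTextContent() on each one avoids handling the whitespace-only text nodes that a manual getChildNodes() walk runs into.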
answered Sep 29 '22 00:09 by toolkit