
Why am I getting extra text nodes as child nodes of root node?

I want to print the child elements of the root node. This is my XML file.

<?xml version="1.0"?>
<!-- Comment-->
<company>
    <staff id="1001">
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff id="2001">
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>

As I understand it, the root node is 'company' and its only child nodes should be the two 'staff' elements. But when I read the child nodes in my Java code, I get 5 of them. Where are the 3 extra text nodes coming from?

Java Code:

package com.training.xml;

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ReadingXML {

    public static void main(String[] args) {
        try {
            File file = new File("D:\\TestFile.xml");

            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(file);
            doc.getDocumentElement().normalize();

            System.out.println("root element: " + doc.getDocumentElement().getNodeName());

            Node rootNode = doc.getDocumentElement();

            System.out.println("root: " + rootNode.getNodeName());

            NodeList nList = rootNode.getChildNodes();

            for (int i = 0; i < nList.getLength(); i++) {
                System.out.println("node name: " + nList.item(i).getNodeName());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

OUTPUT:

root element: company
root: company
node name: #text
node name: staff
node name: #text
node name: staff
node name: #text

Why are the three text nodes appearing here?

Vikas Mangal asked Nov 28 '13 07:11


1 Answer

Why are the three text nodes appearing here?

They're the whitespace between the child elements. If you only want the child elements, you should just ignore nodes of other types:

for (int i = 0; i < nList.getLength(); i++) {
    Node node = nList.item(i);
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        System.out.println("node name: " + node.getNodeName());
    }
}
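To check that the extra nodes really are just whitespace, you can inspect each text node's value. This is only a minimal sketch: the class name WhitespaceNodes and the helper describeChildren are made up for illustration, and it parses a shortened inline copy of the document rather than the file from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WhitespaceNodes {

    // Hypothetical helper: describes each direct child of the root element.
    // Text nodes are reported together with whether they contain only whitespace.
    static List<String> describeChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> result = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node.getNodeType() == Node.TEXT_NODE) {
                    boolean blank = node.getNodeValue().trim().isEmpty();
                    result.add("#text blank=" + blank);
                } else {
                    result.add(node.getNodeName());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened stand-in for the question's file: same line breaks and indentation.
        String xml = "<company>\n"
                + "    <staff id=\"1001\"/>\n"
                + "    <staff id=\"2001\"/>\n"
                + "</company>";
        describeChildren(xml).forEach(System.out::println);
    }
}
```

Each #text entry should be reported as blank, which matches the line breaks and indentation before, between, and after the two staff elements.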

Or you could change your document to not have that whitespace.
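As a quick illustration of that option, the sketch below counts the root's child nodes for an indented document and for the same content written without any whitespace between tags. The class name CompactDocument and the helper rootChildCount are hypothetical, and the inline documents are shortened stand-ins for the file in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CompactDocument {

    // Hypothetical helper: number of direct child nodes (of any type) of the root element.
    static int rootChildCount(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String indented = "<company>\n"
                + "    <staff id=\"1001\"/>\n"
                + "    <staff id=\"2001\"/>\n"
                + "</company>";
        String compact = "<company><staff id=\"1001\"/><staff id=\"2001\"/></company>";

        System.out.println("indented: " + rootChildCount(indented));
        System.out.println("compact:  " + rootChildCount(compact));
    }
}
```

The indented version should report 5 child nodes and the compact one 2, at the cost of a file that is harder for people to read.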

Or you could use a different XML API which allows you to easily ask for just elements. (The DOM API is a pain in various ways.)
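The answer does not name a specific alternative, so the following is just one possibility: the JDK's javax.xml.xpath package can select element children directly. It still hands back DOM nodes, so it is more of a convenience layer than a different API. The class name ChildElements and the helper childElementNames are made up for this sketch.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildElements {

    // Hypothetical helper: names of the element children of the root element.
    // The XPath "/*/*" matches elements only, so whitespace text nodes
    // and comments are never returned.
    static List<String> childElementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList elements = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate("/*/*", doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < elements.getLength(); i++) {
                names.add(elements.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<company>\n"
                + "    <staff id=\"1001\"/>\n"
                + "    <staff id=\"2001\"/>\n"
                + "</company>";
        for (String name : childElementNames(xml)) {
            System.out.println("node name: " + name);
        }
    }
}
```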

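If you only want to ignore element content whitespace, you can use Text.isElementContentWhitespace(). Note that the parser can only flag whitespace this way when it knows the element's content model, for example from a DTD.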
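A rough sketch of that approach follows. It rests on assumptions that are not in the question: the document gets an inline DTD (invented here) declaring that company contains only staff elements, and the parser is switched to validating mode so that it reads that content model. The class name ElementContentWhitespace and its helper are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class ElementContentWhitespace {

    // Hypothetical helper: counts root children that the parser flagged
    // as element content whitespace.
    static int countFlaggedWhitespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: DTD validation is enabled so the content model is known.
            factory.setValidating(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            int count = 0;
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node instanceof Text && ((Text) node).isElementContentWhitespace()) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The DOCTYPE below is invented for this sketch; the original file has no DTD.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE company [\n"
                + "  <!ELEMENT company (staff*)>\n"
                + "  <!ELEMENT staff EMPTY>\n"
                + "  <!ATTLIST staff id CDATA #REQUIRED>\n"
                + "]>\n"
                + "<company>\n"
                + "    <staff id=\"1001\"/>\n"
                + "    <staff id=\"2001\"/>\n"
                + "</company>";
        System.out.println("flagged whitespace nodes: " + countFlaggedWhitespace(xml));
    }
}
```

With the file exactly as posted in the question, which has no DTD, the whitespace nodes would not be flagged, so the node type check shown earlier is the simpler fix there.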

Jon Skeet answered Oct 11 '22 17:10