I want to print the child elements of the root node. This is my XML file.
<?xml version="1.0"?>
<!-- Comment-->
<company>
    <staff id="1001">
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff id="2001">
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>
As I understand it, the root node is 'company' and its child nodes should be the two 'staff' elements. But when I try to get them through my Java code, I get 5 child nodes. Where are the 3 extra text nodes coming from?
Java Code:
package com.training.xml;

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ReadingXML {

    public static void main(String[] args) {
        try {
            File file = new File("D:\\TestFile.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(file);
            doc.getDocumentElement().normalize();

            System.out.println("root element: " + doc.getDocumentElement().getNodeName());

            Node rootNode = doc.getDocumentElement();
            System.out.println("root: " + rootNode.getNodeName());

            NodeList nList = rootNode.getChildNodes();
            for (int i = 0; i < nList.getLength(); i++) {
                System.out.println("node name: " + nList.item(i).getNodeName());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
OUTPUT:
root element: company
root: company
node name: #text
node name: staff
node name: #text
node name: staff
node name: #text
Why are these three text nodes showing up here?
They're the whitespace between the child elements. If you only want the child elements, you should just ignore nodes of other types:
for (int i = 0; i < nList.getLength(); i++) {
    Node node = nList.item(i);
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        System.out.println("node name: " + node.getNodeName());
    }
}
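If you need this filter in more than one place, it may be convenient to wrap it in a small helper. The following is only a sketch: the class name ChildElements and the methods parse and childElements are made up for illustration (they are not part of the DOM API), and the XML is a shortened version of the file in the question, parsed from a string so that the example runs without a file on disk.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ChildElements {

    // Hypothetical helper: parses XML held in a string.
    // Checked parser exceptions are wrapped to keep the example short.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: returns only the direct children that are elements,
    // skipping text nodes (including whitespace) and comments.
    public static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened version of the document from the question, with the
        // same line breaks and indentation between the child elements.
        String xml = "<company>\n"
                + "  <staff id=\"1001\"/>\n"
                + "  <staff id=\"2001\"/>\n"
                + "</company>";

        Element root = parse(xml).getDocumentElement();

        // Counts the whitespace text nodes as well as the two elements.
        System.out.println("all child nodes: " + root.getChildNodes().getLength());

        for (Element e : childElements(root)) {
            System.out.println("node name: " + e.getNodeName() + ", id: " + e.getAttribute("id"));
        }
    }
}
```

Returning a List of Element rather than a NodeList also saves the caller a cast when it wants to read attributes afterwards.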
Or you could change your document to not have that whitespace.
Or you could use a different XML API which allows you to easily ask for just elements. (The DOM API is a pain in various ways.)
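As one possible illustration of asking for just the elements (this is my choice of API, not necessarily the only or best alternative), the JDK's own javax.xml.xpath package can be run against the same DOM document. In XPath, * on the child axis matches element nodes only, so whitespace text nodes never appear in the result. The class and method names below are hypothetical, and the XML is again a shortened version of the file in the question.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathChildElements {

    // Hypothetical helper: returns the element children of the root element.
    // Checked exceptions are wrapped to keep the example short.
    public static NodeList elementChildrenOfRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // "/*/*" selects every element that is a direct child of the
            // root element; text nodes and comments are not matched by "*".
            return (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate("/*/*", doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<company>\n"
                + "  <staff id=\"1001\"/>\n"
                + "  <staff id=\"2001\"/>\n"
                + "</company>";

        NodeList staff = elementChildrenOfRoot(xml);
        for (int i = 0; i < staff.getLength(); i++) {
            System.out.println("node name: " + staff.item(i).getNodeName());
        }
    }
}
```

For a document this small the explicit node-type check is just as easy; XPath tends to pay off once the selection gets more specific, for example /company/staff[@id='1001'].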
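If you only want to ignore element content whitespace, you can use Text.isElementContentWhitespace(). Note that the parser can generally only determine this when it knows the element's content model, for example from a DTD.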