Why getting null node value while parsing XML

Tags:

java

parsing

xml

While parsing the XML below, I first got a MalformedURLException, so instead of passing the XML string directly to the parser I used this code

Document doc = dBuilder.parse(new InputSource(new ByteArrayInputStream(xmlResponse.getBytes("utf-8"))));

according to this link

java.net.MalformedURLException: no protocol

Now I am getting null as the node value. How can I overcome this? The comment inside the for loop marks where the null value appears.

I am using the following code:

try {
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(new InputSource(new ByteArrayInputStream(xmlResponse.getBytes("utf-8"))));
    //read this - https://stackoverflow.com/questions/13786607/normalization-in-dom-parsing-with-java-how-does-it-work
    doc.getDocumentElement().normalize();
    System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
    XPath xPath = XPathFactory.newInstance().newXPath();
    String expression = "/GetMatchingProductForIdResponse/GetMatchingProductForIdResult/Products/Product";
    System.out.println(expression);
    NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(doc, XPathConstants.NODESET);
    System.out.println("the size will be of the node list " + nodeList.getLength());
    for (int i = 0; i < nodeList.getLength(); i++) {
            System.out.println(nodeList.item(i).getNodeValue()+"the value coming will be "); // here i am getting value null for each node
    }
} catch (Exception e) {
    e.printStackTrace(System.out);
}

to parse the XML:

<?xml version="1.0"?>
<GetMatchingProductForIdResponse   xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01">
  <GetMatchingProductForIdResult Id="H5-9OSH-9NZ7" IdType="SellerSKU" status="Success">
    <Products xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01" xmlns:ns2="http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd">
    <Product>
      <Identifiers>
        <MarketplaceASIN>
          <MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
          <ASIN>B004FQLAH2</ASIN>
        </MarketplaceASIN>
      </Identifiers>
      <AttributeSets>
        <ns2:ItemAttributes xml:lang="en-US">
          <ns2:Binding>Office Product</ns2:Binding>
          <ns2:Brand>Konica-Minolta</ns2:Brand>
          <ns2:Color>Y</ns2:Color>
          <ns2:CPUSpeed Units="MHz">200</ns2:CPUSpeed>
          <ns2:Department>Printers</ns2:Department>
          <ns2:Feature>Amp Up your Output - The magicolor 3730DN business color laser printer outputs at speeds up to 25 ppm in both color and B&W which means you can keep up in just about any business environment.</ns2:Feature>
          <ns2:Feature>Unparalleled Image Quality - High resolution 2400 (equivalent) x 600 dpi printing for great color and clarity in both images and text.</ns2:Feature>
          <ns2:Feature>Happy Planet, Outstanding Printing - Simitri HD Toner with Biomass allows for outstanding printing with the environment in mind.</ns2:Feature>
          <ns2:Feature>Connect quicker - Why wait? Standard Ethernet and high-speed USB 2.0 gets you connected faster than ever before.Specifications</ns2:Feature>
          <ns2:Feature>Type - Full-Color Laser Printer</ns2:Feature>
          <ns2:ItemDimensions>
            <ns2:Height Units="inches">13.62</ns2:Height>
            <ns2:Length Units="inches">20.47</ns2:Length>
            <ns2:Width Units="inches">16.50</ns2:Width>
            <ns2:Weight Units="pounds">56.22</ns2:Weight>
          </ns2:ItemDimensions>
          <ns2:IsAutographed>false</ns2:IsAutographed>
          <ns2:IsMemorabilia>false</ns2:IsMemorabilia>
          <ns2:Label>Konica</ns2:Label>
          <ns2:ListPrice>
            <ns2:Amount>449.00</ns2:Amount>
            <ns2:CurrencyCode>USD</ns2:CurrencyCode>
          </ns2:ListPrice>
          <ns2:Manufacturer>Konica</ns2:Manufacturer>
          <ns2:Model>A0VD017</ns2:Model>
          <ns2:NumberOfItems>1</ns2:NumberOfItems>
          <ns2:OperatingSystem>Windows XP, Vista, 7</ns2:OperatingSystem>
          <ns2:OperatingSystem>Mac X 10.2.8, 10.6+</ns2:OperatingSystem>
          <ns2:PackageDimensions>
            <ns2:Height Units="inches">19.00</ns2:Height>
            <ns2:Length Units="inches">24.20</ns2:Length>
            <ns2:Width Units="inches">22.00</ns2:Width>
            <ns2:Weight Units="pounds">65.30</ns2:Weight>
          </ns2:PackageDimensions>
          <ns2:PackageQuantity>1</ns2:PackageQuantity>
          <ns2:PartNumber>A0VD017</ns2:PartNumber>
          <ns2:ProductGroup>CE</ns2:ProductGroup>
          <ns2:ProductTypeName>PRINTER</ns2:ProductTypeName>
          <ns2:Publisher>Konica</ns2:Publisher>
          <ns2:SmallImage>
            <ns2:URL>http://ecx.images-amazon.com/images/I/21qN3BU-BHL._SL75_.jpg</ns2:URL>
            <ns2:Height Units="pixels">75</ns2:Height>
            <ns2:Width Units="pixels">75</ns2:Width>
          </ns2:SmallImage>
          <ns2:Studio>Konica</ns2:Studio>
          <ns2:Title>Konica Minolta Magicolor 3730DN Color Laser Printer 24PPM 2400X600DPI ENET USB 2.0</ns2:Title>
        </ns2:ItemAttributes>
      </AttributeSets>
      <Relationships/>
      <SalesRankings/>
    </Product>
    </Products>
  </GetMatchingProductForIdResult>
  <ResponseMetadata>
    <RequestId>0b508338-3afe-4178-adc4-60c9c8448987</RequestId>
  </ResponseMetadata>
</GetMatchingProductForIdResponse>

Deepak asked Dec 05 '22 08:12

1 Answer

The getNodeValue method in the DOM is defined to always return null for element nodes (see the table at the top of the JavaDoc page for org.w3c.dom.Node for details). If you want the text inside the element then you should use getTextContent() instead.
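
To make the difference concrete, here is a minimal sketch using only the JDK's DOM classes. The class name NodeValueDemo, the helper parse and the one-element sample document are made up for illustration; they are not part of the question's code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NodeValueDemo {
    // Hypothetical helper: parses an XML string held in memory.
    // Wrapping the bytes in a stream avoids parse(String), which expects a URI.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<Product><ASIN>B004FQLAH2</ASIN></Product>");
        Element asin = (Element) doc.getElementsByTagName("ASIN").item(0);

        // An element node has no value of its own.
        System.out.println(asin.getNodeValue());                 // null
        // getTextContent() concatenates the text of all descendants.
        System.out.println(asin.getTextContent());               // B004FQLAH2
        // The text lives in a child Text node, which does have a node value.
        System.out.println(asin.getFirstChild().getNodeValue()); // B004FQLAH2
    }
}
```

In the question's loop, that would mean calling nodeList.item(i).getTextContent() instead of getNodeValue(). Note that for a Product element this returns the concatenated text of everything beneath it, so a more specific XPath is usually what you want.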

You've added a second question in a comment to this answer asking how you can use an XPath to search for nodes that have a namespace prefix such as ns2:. The way XPath 1.0 handles namespaces is that unprefixed names always refer to nodes that are not in a namespace, and if you want to reference namespaced nodes then you have to provide a binding of namespace URIs to prefixes (which in javax.xml.xpath is the job of a NamespaceContext) and then use those prefixes in the expressions. The prefixes you use in the expression need not be the same ones as the original document used, as long as they bind to the right URIs.
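
The rule about unprefixed names can be checked with a small experiment. This is only a sketch: the class name UnprefixedNameDemo, the helper countMatches, the two tiny documents and the urn:example namespace are invented for the demonstration. Both documents are parsed with a namespace-aware parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class UnprefixedNameDemo {
    // Hypothetical helper: parses xml with a namespace-aware parser and
    // returns how many nodes the XPath expression selects.
    static int countMatches(String xml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Elements in a default namespace: the unprefixed path selects nothing.
        System.out.println(countMatches(
                "<root xmlns='urn:example'><item/></root>", "/root/item")); // 0
        // Elements in no namespace: the same path selects the item.
        System.out.println(countMatches(
                "<root><item/></root>", "/root/item"));                     // 1
    }
}
```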

Thus the original XPath you were using:

/GetMatchingProductForIdResponse/GetMatchingProductForIdResult/Products/Product

should not actually have matched anything, because the GetMatchingProductForIdResponse etc. elements in your document are in a namespace, but you got away with it because DocumentBuilderFactory is by default not namespace aware. The correct thing to do here is to use a namespace-aware parser, and provide a suitable namespace context to the XPath engine. There's no default implementation of NamespaceContext available in the core Java library, unfortunately, but Spring provides a convenient SimpleNamespaceContext implementation you can use if you don't want to roll your own.

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true); // parse with namespaces
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new InputSource(new ByteArrayInputStream(xmlResponse.getBytes("utf-8"))));
doc.getDocumentElement().normalize();

XPath xPath =  XPathFactory.newInstance().newXPath();
SimpleNamespaceContext nsCtx = new SimpleNamespaceContext();
xPath.setNamespaceContext(nsCtx);
nsCtx.bindNamespaceUri("prod", "http://mws.amazonservices.com/schema/Products/2011-10-01");
nsCtx.bindNamespaceUri("ns2", "http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd");
String expression = "/prod:GetMatchingProductForIdResponse/prod:GetMatchingProductForIdResult/prod:Products/prod:Product/prod:AttributeSets/ns2:ItemAttributes/ns2:Binding";
// ...
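
If you would rather not depend on Spring, rolling your own NamespaceContext is not much work. The following is a minimal sketch, not a complete implementation: MapNamespaceContext, MapNamespaceContextDemo, the evaluate helper and the shortened sample document are all invented for illustration, and the context does not handle the reserved xml and xmlns prefixes that a fully conforming implementation should. It also binds the prefix attr rather than ns2 to show that the prefixes in the expression only have to map to the right URIs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class MapNamespaceContextDemo {
    // Hypothetical hand-rolled context backed by a prefix -> URI map.
    static final class MapNamespaceContext implements NamespaceContext {
        private final Map<String, String> uriByPrefix;

        MapNamespaceContext(Map<String, String> uriByPrefix) {
            this.uriByPrefix = uriByPrefix;
        }

        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) throw new IllegalArgumentException("prefix is null");
            String uri = uriByPrefix.get(prefix);
            return uri != null ? uri : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            for (Map.Entry<String, String> e : uriByPrefix.entrySet()) {
                if (e.getValue().equals(namespaceURI)) return e.getKey();
            }
            return null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singletonList(prefix).iterator();
        }
    }

    // Hypothetical helper: namespace-aware parse, then evaluate to a string.
    static String evaluate(String xml, String expression, Map<String, String> bindings)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new MapNamespaceContext(bindings));
        return xPath.evaluate(expression, doc);
    }

    // Shortened version of the response from the question.
    static final String SAMPLE =
            "<GetMatchingProductForIdResponse xmlns='http://mws.amazonservices.com/schema/Products/2011-10-01'>"
            + "<GetMatchingProductForIdResult>"
            + "<Products xmlns:ns2='http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd'>"
            + "<Product><AttributeSets><ns2:ItemAttributes>"
            + "<ns2:Binding>Office Product</ns2:Binding>"
            + "</ns2:ItemAttributes></AttributeSets></Product></Products>"
            + "</GetMatchingProductForIdResult></GetMatchingProductForIdResponse>";

    static final String EXPRESSION =
            "/prod:GetMatchingProductForIdResponse/prod:GetMatchingProductForIdResult"
            + "/prod:Products/prod:Product/prod:AttributeSets/attr:ItemAttributes/attr:Binding";

    static Map<String, String> bindings() {
        Map<String, String> m = new HashMap<>();
        m.put("prod", "http://mws.amazonservices.com/schema/Products/2011-10-01");
        m.put("attr", "http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd");
        return m;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(evaluate(SAMPLE, EXPRESSION, bindings())); // Office Product
    }
}
```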

Ian Roberts answered Dec 18 '22 13:12