Why is DocumentBuilder.parse() not working

Tags: java, xml

I have read several posts about how to use the DocumentBuilder.parse() function for getting a document object.

Document document = builder.parse(new InputSource(new StringReader(xml)));

This was returning [#document: null], which, as I found, does not necessarily mean the document is empty. However, after inspecting it further, I found that it is in fact empty.

I am building the String xml myself. To confirm it is valid XML, I ran it through an XML validator, and I also pasted it into Eclipse and formatted it with Ctrl+Shift+F, which is usually my first check for whether something is well formed. I then broke out each part of the parse() argument so I could step through and watch that each one was working correctly.

My code is:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;        
try {
    builder = factory.newDocumentBuilder();
    StringReader sr = new StringReader(xml);
    InputSource is = new InputSource(sr);
    Document document = builder.parse(is);          

    return document;
} catch(Exception e){
    e.printStackTrace();
}

sr and is appear to work correctly until I execute the builder.parse(is) line. As soon as this executes, the sr.str value becomes null and same with is.characterInputStream.str. This seems odd to me, is this expected? This has been driving me crazy, any input would be great!

Edit: my XML string is:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
    <channel>
        <title>Feed Title</title>
        <link>Feed Link</link>
        <description>Feed Description</description>
        <item>
            <title>Item Title</title>
            <link>Item Link</link>
            <description>Item Description</description>
        </item>
        <item>
            <title>Another Item</title>
            <link>Another Link</link>
            <description>Another Description</description>
        </item>
    </channel>
</rss>
Aheinlein asked Apr 10 '14 14:04


1 Answer

"As soon as this executes, the sr.str value becomes null and same with is.characterInputStream.str. This seems odd to me, is this expected?"

Yes, I'd say so. DocumentBuilder.parse is closing the reader. StringReader.close() sets str to null. This is an implementation detail of StringReader - but you should expect to see implementation details when you poke around private fields when debugging. (It's also not documented that DocumentBuilder.parse will close the input it's given, but it seems reasonable.)

It's unclear what the problem is with your XML, but this part of the behaviour is entirely reasonable.
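If you want to observe the closing behaviour without looking at private fields, here is a minimal sketch that uses only the public Reader API. The class name ReaderCloseDemo and the helper isClosed are hypothetical names made up for this illustration. It relies on StringReader.ready() being documented to throw IOException once the reader is closed. Whether parse() actually closes the reader is not promised by the DocumentBuilder documentation, so treat the second printed value as something to check on your own JDK rather than a guarantee.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class ReaderCloseDemo {
    // Hypothetical helper: reports whether a reader has been closed,
    // using only the public API. For a StringReader, ready() throws
    // IOException once close() has been called.
    static boolean isClosed(Reader reader) {
        try {
            reader.ready();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        StringReader sr = new StringReader("<foo />");
        System.out.println("closed before parse: " + isClosed(sr));

        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.parse(new InputSource(sr));

        // Assumption: the parser closes its input. This is what the
        // question observed, but it is not a documented guarantee.
        System.out.println("closed after parse: " + isClosed(sr));
    }
}
```

If the second line reports that the reader is closed, that matches what the debugger showed: the null str field is just how StringReader represents "closed" internally.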

I would strongly recommend that you try your code with the simplest XML you can think of, e.g. "<foo />".

The code you've shown so far is fine. Here's a short but complete program to show it working:

import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.*;
import java.io.*;

class Test {
   public static void main(String [] args) throws Exception {
       DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
       DocumentBuilder builder;
       builder = factory.newDocumentBuilder();
       StringReader sr = new StringReader("<foo />");
       InputSource is = new InputSource(sr);
       Document document = builder.parse(is);
       System.out.println(document.getDocumentElement().getTagName());
   }
}
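Once the simple case works, the same check can be applied to the feed from the question. The following is a sketch rather than a diagnosis of your actual code: the class name InspectDocument, the RSS constant (a whitespace-trimmed copy of the posted feed) and the parse helper are hypothetical names for illustration. It also shows why [#document: null] on its own says little. In the DOM, a Document node has the name "#document" and a null node value, so a string built from the name and value looks "empty" even when the tree has content. My assumption is that the toString() you saw is built that way, which is an implementation detail of the parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class InspectDocument {
    // Trimmed copy of the feed from the question (hypothetical constant).
    static final String RSS =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<rss version=\"2.0\"><channel>"
            + "<title>Feed Title</title>"
            + "<item><title>Item Title</title></item>"
            + "<item><title>Another Item</title></item>"
            + "</channel></rss>";

    // Hypothetical helper: same parsing steps as in the question, with
    // checked exceptions wrapped to keep the example short.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document document = parse(RSS);

        // Name and value of the Document node itself, not of its children.
        System.out.println(document.getNodeName() + " / " + document.getNodeValue());

        // Inspect the actual content instead of relying on toString().
        Element root = document.getDocumentElement();
        System.out.println("root element: " + root.getTagName());

        NodeList items = root.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            String title = item.getElementsByTagName("title").item(0).getTextContent();
            System.out.println("item title: " + title);
        }
    }
}
```

If this prints the root element and both item titles for your string as well, the parsing step is fine and the problem is more likely in how the document is used afterwards.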
Jon Skeet answered Nov 08 '22 05:11