I have built a Jsoup Document by parsing an in-house HTML page:
public Document newDocument(String path) throws IOException {
    Document doc = null;
    doc = Jsoup.connect(path).timeout(0).get();
    return new HtmlDocument<Document>(doc);
}
I want to convert the Jsoup document to an org.w3c.dom.Document.
I used an available library DOMBuilder for this but when parsing I get org.w3c.dom.Document
as null. I am unable to understand the problem; I tried searching but couldn't find any answer.
Code to generate the W3C DOM Document:
Document jsoupDoc = factory.newDocument("http://localhost/testcases/test_2.html");
org.w3c.dom.Document docu= DOMBuilder.jsoup2DOM(jsoupDoc);
Can anyone please help me on this?
Alternatively, Jsoup provides the W3CDom class with the method fromJsoup. This method transforms a Jsoup Document into a W3C document.
Document jsoupDoc = ...
W3CDom w3cDom = new W3CDom();
org.w3c.dom.Document w3cDoc = w3cDom.fromJsoup(jsoupDoc);
UPDATE:
To retrieve a jsoup document via HTTP, make a call to Jsoup.connect(...).get(). To load a jsoup document locally, make a call to Jsoup.parse(new File("..."), "UTF-8").
The call to DOMBuilder is correct.
When you say,
I used an available library DOMBuilder for this but when parsing I get org.w3c.dom.Document as null.
I think you mean, "I used an available library, DOMBuilder, for this but when printing the result, I get [#document: null]." At least, that was the result I saw when I tried printing the w3cDoc object, but that doesn't mean the object is null. I was able to traverse the document by making calls to getDocumentElement and getChildNodes.
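import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
// DOMBuilder is the third-party helper from the question; import it from wherever it lives in your project.

public class JsoupToW3c {
    public static void main(String[] args) throws IOException {
        // Fetch and parse the page; an I/O failure propagates instead of leaving jsoupDoc null.
        Document jsoupDoc = Jsoup.connect("http://stackoverflow.com/questions/17802445").get();
        // Convert the jsoup tree to a W3C DOM tree.
        org.w3c.dom.Document w3cDoc = DOMBuilder.jsoup2DOM(jsoupDoc);
        // Printing w3cDoc itself shows [#document: null], yet the tree can be traversed.
        Element e = w3cDoc.getDocumentElement();
        NodeList childNodes = e.getChildNodes();
        Node n = childNodes.item(2);
        System.out.println(n.getNodeName());
    }
}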
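To see why the printed value is misleading, here is a minimal sketch that uses only the JDK, with no jsoup or DOMBuilder involved. It assumes the [#document: null] output comes from a toString that prints the node name and node value, which is implementation-specific behaviour and not something the DOM API guarantees. What the DOM specification does define is that a Document node has the name #document and a null node value. The class name W3cDocumentInspection and the hand-built sample document are hypothetical stand-ins for the converted jsoup document. Serializing with a Transformer is one way to inspect the real content:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class W3cDocumentInspection {

    // Builds a tiny W3C document by hand (a stand-in for the converted jsoup document).
    public static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        body.setTextContent("hello");
        html.appendChild(body);
        doc.appendChild(html);
        return doc;
    }

    // Serializes the whole tree to markup so its content can be inspected.
    public static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = sampleDocument();
        // Node name and node value of the Document node itself.
        System.out.println(doc.getNodeName());
        System.out.println(doc.getNodeValue());
        // The root element is present even though the node value is null.
        System.out.println(doc.getDocumentElement().getNodeName());
        // The serialized markup of the full tree.
        System.out.println(serialize(doc));
    }
}
```

A null node value on the Document node says nothing about whether the tree is empty, so check getDocumentElement() or serialize the document, and do not rely on printing the object itself.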