I have some non-well-formed XML (HTML) data in Java. I tried the JAXP DOM parser, but it complains.
The question is: is there any way to use JAXP to parse such documents?
I have a file containing data such as:
<employee>
<name value="ahmed" > <!-- note: this element is not closed, so it is not well-formed XML -->
</employee>
You will need to post more information. "XML Parsing Error" occurs when something is trying to read the XML, not when it is being generated. Also, "not well-formed" usually refers to errors in the structure of the document, such as a missing end-tag, not the characters it contains.
You can try parsing an HTML file with an XML parser, but it is likely to fail. HTML documents can use features that XML parsers don't understand, such as elements that are never closed, and an XML parser will fail on any HTML document that uses one of them.
If the document is not well-formed, the XML processor should report one or more errors encountered, and normal processing, including the passing of parsed data to the application, should stop.
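To see this behaviour with JAXP itself, here is a minimal sketch using only the JDK's built-in parser. The class name `WellFormedCheck` and the helper `isWellFormed` are made up for illustration; the input is the fragment from the question, plus a corrected copy in which `name` is closed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class WellFormedCheck {

    // Hypothetical helper: true only if the JAXP DOM parser accepts the text.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the default handler print them.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignore */ }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // A well-formedness violation is reported as a fatal error and parsing stops.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String broken = "<employee>\n<name value=\"ahmed\" >\n</employee>";
        String fixed = "<employee>\n<name value=\"ahmed\" />\n</employee>";
        System.out.println("broken: " + isWellFormed(broken));
        System.out.println("fixed:  " + isWellFormed(fixed));
    }
}
```

The parser offers no lenient mode here: the only way to get a DOM out of JAXP is to hand it input that is already well-formed.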
You could try running your document through the JTidy API first. It can convert HTML into valid XHTML, which an XML parser can then read: http://jtidy.sourceforge.net/howto.html
Tidy tidy = new Tidy();
tidy.setXHTML(true);
tidy.parse(......)...