I want to parse the DOCTYPE of a page with Jsoup to discover the version of HTML (HTML 5, HTML 4, XHTML, etc.).
Is it possible to parse the DOCTYPE with Jsoup? If not, is there another way to achieve the main objective, which is discovering the HTML version of the page?
Jsoup has a DocumentType class for this purpose:
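import java.util.List;
import org.jsoup.nodes.DocumentType;
import org.jsoup.nodes.Node;

// doc is an already parsed org.jsoup.nodes.Document
List<Node> nodes = doc.childNodes();
for (Node node : nodes) {
    // The DOCTYPE, if present, is a direct child of the document
    if (node instanceof DocumentType) {
        DocumentType documentType = (DocumentType) node;
        System.out.println(documentType.toString());
        // attr() is an instance method, so call it on documentType
        System.out.println(documentType.attr("publicid"));
    }
}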
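To get from the DOCTYPE to an actual version label, one option is to inspect the name and public identifier yourself. The following is only a rough sketch: HtmlVersionGuesser and classify are hypothetical names, not part of Jsoup, and the substring checks assume the common W3C public identifiers (for example "-//W3C//DTD XHTML 1.0 Strict//EN"). It takes plain strings, which are assumed to come from documentType.attr("name") and documentType.attr("publicid") in the loop above.

```java
import java.util.Locale;

// Hypothetical helper, not part of Jsoup: maps DOCTYPE fields to a version label.
final class HtmlVersionGuesser {

    // name and publicId are assumed to be the values read from the DocumentType
    // node via attr("name") and attr("publicid"); either may be empty.
    static String classify(String name, String publicId) {
        String id = publicId == null ? "" : publicId.toUpperCase(Locale.ROOT);
        if (id.isEmpty()) {
            // <!DOCTYPE html> carries no public identifier.
            return "html".equalsIgnoreCase(name) ? "HTML5" : "unknown";
        }
        // Check XHTML before HTML, and 4.01 before 4.0, because the
        // shorter strings are substrings of the longer ones.
        if (id.contains("XHTML 1.1")) return "XHTML 1.1";
        if (id.contains("XHTML 1.0")) return "XHTML 1.0";
        if (id.contains("HTML 4.01")) return "HTML 4.01";
        if (id.contains("HTML 4.0")) return "HTML 4.0";
        if (id.contains("HTML 3.2")) return "HTML 3.2";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify("html", ""));
        System.out.println(classify("html", "-//W3C//DTD XHTML 1.0 Strict//EN"));
    }
}
```

Inside the loop you would call something like HtmlVersionGuesser.classify(documentType.attr("name"), documentType.attr("publicid")). If the page has no DOCTYPE at all, the loop never finds a DocumentType node, so that case has to be handled separately.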