
Is it possible to parse the DOCTYPE with Jsoup to discover the HTML version?

I want to parse the DOCTYPE of a page with Jsoup to discover the version of HTML (HTML 5, HTML 4, XHTML, etc.).

Is it possible to parse the DOCTYPE with Jsoup? If not, is there another way to achieve the main objective, which is discovering the HTML version of the page?

Renato Dinhani asked Apr 11 '12 14:04



1 Answer

Jsoup has a DocumentType class for this purpose:

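import java.util.List;
import org.jsoup.nodes.DocumentType;
import org.jsoup.nodes.Node;

// doc is an already parsed org.jsoup.nodes.Document
List<Node> nodes = doc.childNodes();
for (Node node : nodes) {
    // The doctype, when present, is a top-level child of the document
    if (node instanceof DocumentType) {
        DocumentType documentType = (DocumentType) node;
        System.out.println(documentType.toString());
        // attr is an instance method, so call it on documentType, not on the class
        System.out.println(documentType.attr("publicid"));
    }
}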
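
To get from the doctype to an actual HTML version, the name and public identifier read above can be matched against the well-known W3C public IDs. The following is only a rough sketch under stated assumptions: the class DoctypeVersion and the method classify are hypothetical names, not part of Jsoup, and the mapping covers only a few common doctypes. It uses plain strings, so you would pass in documentType.attr("name") and documentType.attr("publicid").

```java
import java.util.Locale;

// Hypothetical helper, not part of Jsoup: guesses the HTML flavour
// from the doctype name and public identifier.
public class DoctypeVersion {

    public static String classify(String name, String publicId) {
        if (name == null || name.isEmpty()) {
            return "unknown"; // no doctype found in the document
        }
        String id = publicId == null ? "" : publicId.toUpperCase(Locale.ROOT);
        if (id.isEmpty()) {
            // <!DOCTYPE html> carries no public identifier
            return name.equalsIgnoreCase("html") ? "HTML5" : "unknown";
        }
        // XHTML checks come first; HTML 4.01 is checked before HTML 4.0
        // because "HTML 4.01" contains "HTML 4.0".
        if (id.contains("XHTML 1.1")) return "XHTML 1.1";
        if (id.contains("XHTML 1.0")) return "XHTML 1.0";
        if (id.contains("HTML 4.01")) return "HTML 4.01";
        if (id.contains("HTML 4.0")) return "HTML 4.0";
        if (id.contains("HTML 3.2")) return "HTML 3.2";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify("html", ""));                                 // HTML5
        System.out.println(classify("html", "-//W3C//DTD HTML 4.01//EN"));        // HTML 4.01
        System.out.println(classify("html", "-//W3C//DTD XHTML 1.0 Strict//EN")); // XHTML 1.0
    }
}
```

Anything the sketch does not recognise comes back as "unknown", so unusual or malformed doctypes would need their own checks.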
vacuum answered Oct 17 '22 00:10



