How to make Transformer not remove <!DOCTYPE html>?

Tags: java, xml

When I pass HTML5 code through a javax.xml.transform.Transformer, the <!DOCTYPE html> declaration gets removed. Here is some sample code:

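import java.io.ByteArrayOutputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import com.google.common.base.Joiner;     // Guava
import org.apache.commons.io.IOUtils;     // Apache Commons IO

// The wrapper class name is arbitrary.
public class DoctypeExample {
    public static void main(String[] args) {
        // Build the HTML5 input, including the DOCTYPE that later goes missing.
        StreamSource source = new StreamSource(
            IOUtils.toInputStream(
                Joiner.on('\n').join(
                    "<!DOCTYPE html>",
                    "<html>",
                    "<head>",
                    "</head>",
                    "<body>",
                    "</body>",
                    "</html>"
                )
            )
        );
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try {
            // A Transformer created without a stylesheet performs an identity transform.
            Transformer transformer = TransformerFactory
                .newInstance()
                .newTransformer();
            transformer.transform(source, new StreamResult(result));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        System.out.println(
            result.toString()
        );
    }
}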

Output is:

<html>

<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">

</head>

<body>

</body>

</html>

What can I do to preserve <!DOCTYPE html>?

asked Sep 13 '25 09:09 by gvlasov


1 Answer

The DOCTYPE declaration isn't part of the XSLT data model, so the transformer never sees it and therefore can't preserve it. Also, the syntax <!DOCTYPE html> wasn't around when XSLT 1.0 (or even 2.0) was standardised, so there's no standard way of generating it. But see "Set HTML5 doctype with XSLT" for workarounds.
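
One possible workaround, as a minimal sketch rather than a definitive fix: since the transformer cannot carry the DOCTYPE through, write it to the output yourself and let the transformer append the serialized document after it. The class and method names below (Html5DoctypeDemo, transformKeepingDoctype) are hypothetical. The sketch assumes the html output method, which emits no XML declaration ahead of the root element, and it uses only JDK classes in place of IOUtils and Joiner.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class and method names, for illustration only.
class Html5DoctypeDemo {

    static String transformKeepingDoctype(String html) {
        StringWriter out = new StringWriter();
        // The transformer never sees the input DOCTYPE, so emit it by hand first.
        out.write("<!DOCTYPE html>\n");
        try {
            Transformer transformer = TransformerFactory
                .newInstance()
                .newTransformer();
            // Ask for HTML serialization explicitly instead of relying on
            // the default chosen from the root element name.
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            // The serialized document is appended after the DOCTYPE line.
            transformer.transform(
                new StreamSource(new StringReader(html)),
                new StreamResult(out));
        } catch (TransformerException e) {
            throw new RuntimeException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = String.join("\n",
            "<!DOCTYPE html>",
            "<html>",
            "<head>",
            "</head>",
            "<body>",
            "</body>",
            "</html>");
        System.out.println(transformKeepingDoctype(html));
    }
}
```

If writing the prefix by hand is not an option, setting OutputKeys.DOCTYPE_SYSTEM to "about:legacy-compat" should make the serializer emit a legacy-compat DOCTYPE, which HTML5 also accepts, although it is not the short <!DOCTYPE html> form.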

answered Sep 14 '25 23:09 by Michael Kay