
xml, html or xhtml in <xsl:output>: Which is the better choice?

Tags: xml, xslt, xslt-2.0

For historical reasons we have a mixture of

<xsl:output method="xml">

and

<xsl:output method="html">

and

<xsl:output method="xhtml">

inside an include hierarchy of XSL files. We now want to refactor so that all XSL files use the same output method.

In the end we want to produce XHTML output, so I suppose the last of these would be the best choice.

But what are the differences between these three output methods, and which would you use for what kind of solution?

Edit: I'm using XSLT 2.0

B.E. asked Dec 15 '09 09:12



3 Answers

method="html" serializes the result as HTML, so the output may not be well-formed XML (for example, <br> is written without a closing tag). If you are only sending the output to browsers and do not need to parse it as XML, that may work for you.

method="xml" serializes the result as XML, so the output will be well-formed, but browsers that treat it as HTML can mishandle it. The usual problem is empty elements written in minimized form, such as <script /> and <div />. To avoid that you have to resort to workarounds, such as putting a comment inside the element (e.g. <script src="someJSFile.js"><!-- don't close my script tag --></script>).

If you have an XSLT 2.0 engine and want well-formed output that browsers handle correctly, without worrying about how individual elements are serialized, use method="xhtml".
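To compare the three methods on your own processor, a minimal stylesheet along these lines may help. It is only a sketch: someJSFile.js is the file name from the example above, and the rest of the markup is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- Change method to "xml" or "html" and rerun to compare the output. -->
  <xsl:output method="xhtml" encoding="UTF-8" indent="no"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>Output method demo</title>
        <!-- Empty element whose content model is not EMPTY -->
        <script type="text/javascript" src="someJSFile.js"/>
      </head>
      <body>
        <!-- Another empty element that must keep its end tag -->
        <div/>
        <!-- Element whose content model is EMPTY -->
        <br/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With method="xhtml", the serialization rules call for <script ...></script>, <div></div> and <br />. With method="xml", the processor is free to write the empty script and div in minimized form. As far as I know, the html method applies its HTML rules only to elements in no namespace, so to try method="html" you would probably also want to drop the default xmlns declaration.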

Mads Hansen answered Nov 18 '22 19:11



I found the answer by reading the serialization specification that accompanies XSLT 2.0 (XSLT 2.0 and XQuery 1.0 Serialization). For the xhtml output method it states:

Given an empty instance of an XHTML element whose content model is not EMPTY (for example, an empty title or paragraph) the serializer MUST NOT use the minimized form. That is, it MUST output <p></p> and not <p />.

Given an XHTML element whose content model is EMPTY, the serializer MUST use the minimized tag syntax, for example <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many existing user agents. The serializer MUST include a space before the trailing />, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />.

The serializer MUST NOT use the entity reference &apos; which, although legal in XML and therefore in XHTML, is not defined in HTML and is not recognized by all HTML user agents.

The serializer SHOULD output namespace declarations in a way that is consistent with the requirements of the XHTML DTD if this is possible. The XHTML 1.0 DTDs require the declaration xmlns="http://www.w3.org/1999/xhtml" to appear on the html element, and only on the html element. The serializer MUST output namespace declarations that are consistent with the namespace nodes present in the result tree, but it MUST avoid outputting redundant namespace declarations on elements where the DTD would make them invalid.

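So the answer is to use <xsl:output method="xhtml"/>: it produces well-formed XML while following the rules above that keep the output usable in HTML user agents.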
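For the refactoring itself, one possible approach is to keep a single xsl:output declaration in a small shared module and remove the declarations from every other file. The sketch below assumes a hypothetical file name, output.xsl, and XHTML 1.0 Strict as the target; neither comes from the original setup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- output.xsl (hypothetical name): the only file that declares xsl:output -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The doctype values are the standard XHTML 1.0 Strict identifiers.
       Use the Transitional identifiers, or omit both attributes,
       if that suits the pages better. -->
  <xsl:output method="xhtml"
              encoding="UTF-8"
              indent="yes"
              doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
              doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>

</xsl:stylesheet>
```

Each entry-point stylesheet would then pull it in with <xsl:include href="output.xsl"/>. Keeping one declaration also avoids xsl:output elements that disagree on method at the same import precedence, which, as far as I understand the XSLT 2.0 rules, is an error.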

B.E. answered Nov 18 '22 18:11



As far as I know, XSLT 1.0 has no xhtml value for the method attribute of xsl:output; it defines only xml, html and text.

W3Schools agrees with this.

Since XHTML is an XML dialect, method="xml" is what I would use in XSLT 1.0.

If, however, you are using XSLT 2.0, you might as well use method="xhtml", since that is what you are outputting.
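For the XSLT 1.0 case, a rough sketch of the method="xml" route might look like this. The doctype identifiers are the standard XHTML 1.0 Strict ones; the page content is made up, and someJSFile.js is just the file name used in the other answer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- XSLT 1.0 has no xhtml method, so serialize as XML with an XHTML doctype. -->
  <xsl:output method="xml"
              encoding="UTF-8"
              indent="no"
              doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
              doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>XSLT 1.0 fallback</title>
        <!-- The generated comment gives script some content,
             so the serializer cannot minimize the element. -->
        <script type="text/javascript" src="someJSFile.js">
          <xsl:comment> keep open </xsl:comment>
        </script>
      </head>
      <body>
        <p>Hello</p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The drawback is that every element that must not be minimized needs the same treatment, which is exactly the bookkeeping the xhtml method in XSLT 2.0 takes care of.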

Oded answered Nov 18 '22 20:11
