
How to transform &nbsp; in XSLT

Tags: html, cdata, xslt

I have the following XSLT:

<span><xsl:text disable-output-escaping="yes"><![CDATA[&nbsp;Some text]]></xsl:text></span>

After transformation I get:

<span>&amp;nbsp;Some text</span>

which is rendered as: &nbsp;Some text

I want &nbsp; to be rendered as a space character. I have also tried changing disable-output-escaping to no, but it didn't help.

Thanks for the help.

zosim asked Nov 24 '11 13:11




3 Answers

The other two answers are correct, but I decided to take a somewhat broader view of the subject.

What everyone should know about CDATA sections

A CDATA section is just an alternative serialization of an escaped XML string. This means that a parser produces the same result for <span><![CDATA[ a & b < 2 ]]></span> and <span> a &amp; b &lt; 2 </span>. XML applications work on the parsed data, so an XML application should produce the same output for both of these input elements.

Briefly: escaped data and un-escaped data inside a CDATA section mean exactly the same thing.
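
The equivalence can be checked with any XML parser. Here is a minimal sketch using Python's standard xml.etree.ElementTree module; the choice of Python and the variable names are only for illustration and are not part of the original question.

```python
import xml.etree.ElementTree as ET

# The same character data, serialized once as CDATA and once with escapes.
cdata_form = ET.fromstring("<span><![CDATA[ a & b < 2 ]]></span>")
escaped_form = ET.fromstring("<span> a &amp; b &lt; 2 </span>")

print(repr(cdata_form.text))                 # ' a & b < 2 '
print(cdata_form.text == escaped_form.text)  # True

# The text from the question: inside CDATA, "&nbsp;" is six literal characters.
question_cdata = ET.fromstring("<t><![CDATA[&nbsp;Some text]]></t>")
question_escaped = ET.fromstring("<t>&amp;nbsp;Some text</t>")
print(question_cdata.text == question_escaped.text)  # True
```

After parsing, nothing in the element tells you which of the two forms was used in the source.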

In this case

<span><xsl:text disable-output-escaping="yes"><![CDATA[&nbsp;Some text]]></xsl:text></span>

is identical to

<span><xsl:text disable-output-escaping="yes">&amp;nbsp;Some text</xsl:text></span>

Note that the & character has been escaped to &amp; in the latter serialization form.

What everyone should know about disable-output-escaping

disable-output-escaping is a feature that concerns serialization only. In order to keep the serialized XML well-formed, XSLT processors escape & and < (and possibly other characters) by using entities. Their escaped forms are &amp; and &lt;. Escaped or not, the XML data is the same. The XSLT elements <xsl:value-of> and <xsl:text> can have a disable-output-escaping attribute, but it is generally advised to avoid this feature. The reasons are:

  • An XSLT processor may produce only a result tree, which is passed on to another process without being serialized in between. In such a case disabling output escaping will fail because the XSLT processor does not control the serialization of the result tree.
  • An XSLT processor is not required to support the disable-output-escaping attribute. In such a case the processor must escape the output (or it may raise an error), so again, disabling output escaping will fail.
  • An XSLT processor must escape characters that cannot be represented as such in the encoding used for the output document. Using disable-output-escaping on such characters will result in an error or in escaped text, so again, disabling output escaping will fail.
  • Disabling output escaping can easily lead to malformed or invalid XML, so using it requires great care or post-processing of the output with non-XML tools.
  • disable-output-escaping is often misunderstood and misused, and the same result can usually be achieved by more conventional means, e.g. by creating new elements as literals or with <xsl:element>.

In this case

<span><xsl:text disable-output-escaping="yes"><![CDATA[&nbsp;Some text]]></xsl:text></span>

should output

<span>&nbsp;Some text</span>

but the & character got escaped instead, so in this case disabling output escaping seems to have failed.
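
To see where the &amp; comes from, it helps to watch a serializer that escapes normally. The sketch below uses Python's xml.etree.ElementTree rather than an XSLT processor (an assumption made only for illustration), but the escaping step works the same way: the text node holds the six literal characters "&nbsp;", and the serializer must escape the & to keep the output well-formed.

```python
import xml.etree.ElementTree as ET

# Build the result tree by hand: the text node contains a literal "&".
span = ET.Element("span")
span.text = "&nbsp;Some text"

# Serialization escapes "&" so that the output is well-formed XML.
serialized = ET.tostring(span, encoding="unicode")
print(serialized)  # <span>&amp;nbsp;Some text</span>

# Parsing the output again yields the original data: escaping changed
# the serialization only, not the content.
reparsed = ET.fromstring(serialized)
print(reparsed.text == span.text)  # True
```

This is the output the question reports, which suggests the processor in use applied normal escaping despite the attribute.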

What everyone should know about using entities

If an XML document contains an entity reference, the entity must be declared; if it is not, the document is not valid (and if the document has no DTD at all, it is not even well-formed). XML has only five predefined entities. They are:

  • &amp; for &
  • &lt; for <
  • &gt; for >
  • &quot; for "
  • &apos; for '

All other entity references must be defined either in an internal DTD of the document or in an external DTD that the document refers to (directly or indirectly). Blindly adding entity references to an XML document might therefore result in an invalid document. Documents with an (X)HTML DOCTYPE can use many more entities (like &nbsp;) because the XHTML DTD refers to a DTD that contains their definitions. The entities are defined in these three DTDs: http://www.w3.org/TR/html4/HTMLlat1.ent , http://www.w3.org/TR/html4/HTMLsymbol.ent and http://www.w3.org/TR/html4/HTMLspecial.ent .
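
As a quick check of the rule above, here is a small Python sketch using the standard xml.etree.ElementTree parser. The helper name parses is hypothetical. The five predefined entities parse without any declaration, while an undeclared &nbsp; is rejected.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if xml_text parses, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

predefined = "<span>&amp; &lt; &gt; &quot; &apos;</span>"
undeclared = "<span>&nbsp;Some text</span>"

print(parses(predefined))  # True
print(parses(undeclared))  # False: nbsp is not declared anywhere
```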

An entity reference does not always get replaced with its replacement text. This can happen, for example, if the parser has no network connection to retrieve the DTD. Also, non-validating parsers are not required to read external DTDs, so they need not include the replacement text of entities declared there. In such cases the data represented by the entity is "lost". If the entity replacement works, there will be no sign in the parsed data model that the XML serialization contained any entity references at all. The data model is the same whether one uses entities or their replacement values. Briefly: entities are only an alternative way to represent the replacement text of the entity reference.

In this case the replacement text of &nbsp; is &#160; (the same as &#xA0; in hexadecimal notation). Instead of trying to output the &nbsp; entity, it is easier and more robust to just use the solution suggested by @phihag. If you like the readability of the &nbsp; entity, you can follow the solution suggested by @Michael Krelin and define that entity in an internal DTD. After that, you can use it directly within your XSLT code.

Do note that in both cases the XSLT processor will output the literal non-breaking space character and not the &nbsp; entity reference or the &#160; character reference. Creating such references manually with XSLT 1.0 requires the disable-output-escaping feature, which has its own problems as stated above.
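
The same behaviour can be observed outside XSLT. The following sketch again uses Python's xml.etree.ElementTree purely as an illustration, and the variable names are made up. It shows that an entity declared in an internal DTD, the decimal reference and the hexadecimal reference all parse to the same character, U+00A0, and that re-serializing emits that character rather than &nbsp;.

```python
import xml.etree.ElementTree as ET

# &nbsp; declared in an internal DTD, as in the second suggested solution.
with_entity = ET.fromstring(
    '<!DOCTYPE span [<!ENTITY nbsp "&#xa0;">]>'
    "<span>&nbsp;Some text</span>"
)
decimal_ref = ET.fromstring("<span>&#160;Some text</span>")
hex_ref = ET.fromstring("<span>&#xA0;Some text</span>")

# All three forms produce identical parsed data.
print(with_entity.text == decimal_ref.text == hex_ref.text)  # True
print(with_entity.text[0] == "\u00a0")                       # True

# No trace of the entity reference survives serialization.
serialized = ET.tostring(with_entity, encoding="unicode")
print("&nbsp;" in serialized)   # False
print("\u00a0" in serialized)   # True
```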

jasso answered Oct 19 '22 21:10


I think you should use &#xa0;, because the &nbsp; entity is likely not defined. And no CDATA.

One more possibility is to define the nbsp entity in your XSL file:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xsl:stylesheet [
 <!ENTITY nbsp "&#xa0;">
]>
<xsl:stylesheet version="1.0" …

Michael Krelin - hacker answered Oct 19 '22 23:10


In CDATA, all values are literal. You want:

<span><xsl:text>&#160;Some text</xsl:text></span>

phihag answered Oct 19 '22 23:10