I have an XML file which stores data, and I am using an XSL stylesheet to generate HTML files from it. When I run the transformation I get the error "Illegal HTML character: decimal 150".
I am not allowed to change the XML file. I have to map that character, and many other illegal characters, to a legal character (any will do) in the XSL. The mapping has to be generic, not written for just one character.
You can define a character map (XSLT 2.0) that maps each disallowed character to an allowed one, for instance a space:
<xsl:output indent="yes" method="html" use-character-maps="m1"/>
<xsl:character-map name="m1">
<xsl:output-character character="&#150;" string=" "/>
</xsl:character-map>
As an alternative, use a template that replaces all illegal characters. According to http://www.w3.org/TR/xslt-xquery-serialization/#HTML_CHARDATA these are the control characters #x7F-#x9F, so using
<xsl:template match="text()">
<xsl:value-of select="replace(., '[&#x7F;-&#x9F;]', ' ')"/>
</xsl:template>
should make sure those characters in text nodes in the input document are replaced by a space.
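To show where that template sits, here is a minimal sketch of a complete XSLT 2.0 stylesheet. The input element names doc and para are hypothetical, chosen only for illustration; your own templates would take their place. The point is that the text() template only runs for text nodes reached through xsl:apply-templates, so content copied with xsl:value-of or taken from attributes would need the same replace() call applied directly.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- Hypothetical input shape: a doc root element containing para elements -->
  <xsl:template match="/doc">
    <html>
      <head><title>Example</title></head>
      <body><xsl:apply-templates/></body>
    </html>
  </xsl:template>

  <xsl:template match="para">
    <!-- apply-templates (not value-of) so the text() template below is used -->
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Replace every character in the range #x7F-#x9F with a space -->
  <xsl:template match="text()">
    <xsl:value-of select="replace(., '[&#x7F;-&#x9F;]', ' ')"/>
  </xsl:template>

</xsl:stylesheet>
```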
As another alternative, you could consider outputting XHTML, with the result elements in the XHTML namespace and the output method xhtml.
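A rough sketch of that XHTML variant, again with hypothetical input element names doc and para: the default namespace declaration on the stylesheet puts the literal result elements into the XHTML namespace, and the xhtml output method serializes under XML rules, where the characters #x7F-#x9F are not rejected. Whether a browser then displays them sensibly is a separate question, so treat this as an assumption to test against your data.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- xhtml instead of html: the HTML-only character restriction does not apply -->
  <xsl:output method="xhtml" indent="yes"/>

  <!-- Hypothetical input shape: a doc root element containing para elements -->
  <xsl:template match="/doc">
    <html>
      <head><title>Example</title></head>
      <body><xsl:apply-templates/></body>
    </html>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```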
Based on that list of characters, a full character map that maps all illegal control characters to a space (referenced from xsl:output with use-character-maps="no-control-characters") is
<xsl:character-map name="no-control-characters">
<xsl:output-character character="&#127;" string=" "/>
<xsl:output-character character="&#128;" string=" "/>
<xsl:output-character character="&#129;" string=" "/>
<xsl:output-character character="&#130;" string=" "/>
<xsl:output-character character="&#131;" string=" "/>
<xsl:output-character character="&#132;" string=" "/>
<xsl:output-character character="&#133;" string=" "/>
<xsl:output-character character="&#134;" string=" "/>
<xsl:output-character character="&#135;" string=" "/>
<xsl:output-character character="&#136;" string=" "/>
<xsl:output-character character="&#137;" string=" "/>
<xsl:output-character character="&#138;" string=" "/>
<xsl:output-character character="&#139;" string=" "/>
<xsl:output-character character="&#140;" string=" "/>
<xsl:output-character character="&#141;" string=" "/>
<xsl:output-character character="&#142;" string=" "/>
<xsl:output-character character="&#143;" string=" "/>
<xsl:output-character character="&#144;" string=" "/>
<xsl:output-character character="&#145;" string=" "/>
<xsl:output-character character="&#146;" string=" "/>
<xsl:output-character character="&#147;" string=" "/>
<xsl:output-character character="&#148;" string=" "/>
<xsl:output-character character="&#149;" string=" "/>
<xsl:output-character character="&#150;" string=" "/>
<xsl:output-character character="&#151;" string=" "/>
<xsl:output-character character="&#152;" string=" "/>
<xsl:output-character character="&#153;" string=" "/>
<xsl:output-character character="&#154;" string=" "/>
<xsl:output-character character="&#155;" string=" "/>
<xsl:output-character character="&#156;" string=" "/>
<xsl:output-character character="&#157;" string=" "/>
<xsl:output-character character="&#158;" string=" "/>
<xsl:output-character character="&#159;" string=" "/>
</xsl:character-map>
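I generated that list with XSLT 2.0 and Saxon, using the following stylesheet, which starts at the named template main (for example with Saxon's -it:main option):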
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias"
exclude-result-prefixes="xs axsl">
<xsl:param name="start" as="xs:integer" select="127"/>
<xsl:param name="end" as="xs:integer" select="159"/>
<xsl:param name="replacement" as="xs:string" select="' '"/>
<xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>
<xsl:output method="xml" indent="yes" use-character-maps="character-reference"/>
<xsl:character-map name="character-reference">
<xsl:output-character character="«" string="&amp;"/>
</xsl:character-map>
<xsl:template name="main">
<axsl:character-map name="no-control-characters">
<xsl:for-each select="$start to $end">
<axsl:output-character character="«#{.};" string="{$replacement}"/>
</xsl:for-each>
</axsl:character-map>
</xsl:template>
</xsl:stylesheet>