I have something like this:
<node TEXT=" txt A "/>
<node TEXT="
txt X
"/>
<node>
<html>
<p>
txt Y
</p>
</html>
</node>
<node TEXT="txt B"/>
and I want to use XSLT to get this:
txt A
txt X
txt Y
txt B
I want to strip all unnecessary whitespace and line breaks from the @TEXT attributes and from the character data (text nodes). The only input elements that give structure to the output are the <node> tags.
The following transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="*">
<xsl:apply-templates select="@TEXT | node()"/>
</xsl:template>
<xsl:template match="node/@TEXT | text()">
<xsl:if test="normalize-space(.)">
<xsl:value-of select=
"concat(normalize-space(.), '&#xA;')"/>
</xsl:if>
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
when applied to this XML document:
<t>
<node TEXT=" txt A "/>
<node TEXT=" txt X"/>
<node>
<html>
<p> txt Y </p>
</html>
</node>
<node TEXT="txt B"/>
</t>
produces the wanted result:
txt A
txt X
txt Y
txt B
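Do note the use of the standard XPath function normalize-space(), which strips all leading and trailing whitespace (spaces, tabs and line breaks) and replaces every remaining run of whitespace characters with a single space. Its result is also used as the xsl:if test: for a whitespace-only node it is the empty string, which evaluates to false, so the indentation between elements produces no output lines.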
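As a small illustration of what normalize-space() does on its own, here is a minimal sketch that is not part of the solution above. The literal strings are made up to mirror the kinds of values in the question, and the stylesheet ignores the content of whatever input document it is run against:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Only the literal strings below matter; the input document is not inspected. -->
  <xsl:template match="/">
    <!-- Leading and trailing spaces are dropped. -->
    <xsl:value-of select="concat('[', normalize-space(' txt A '), ']&#xA;')"/>
    <!-- Line breaks count as whitespace too. -->
    <xsl:value-of select="concat('[', normalize-space('&#xA;txt X&#xA;'), ']&#xA;')"/>
    <!-- An internal run of spaces, tabs and line breaks collapses to one space. -->
    <xsl:value-of select="concat('[', normalize-space('txt &#x9;&#xA;  Y'), ']&#xA;')"/>
    <!-- Whitespace-only input yields the empty string. -->
    <xsl:value-of select="concat('[', normalize-space('  &#xA; '), ']&#xA;')"/>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 1.0 processor this should print [txt A], [txt X], [txt Y] and [] on four separate lines. The line breaks and the tab are written as character references (&#xA;, &#x9;) because a literal line break or tab typed inside an attribute value is turned into a plain space by the XML parser before XSLT ever sees it.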