I have a series of documents output by a Java application that exports XML with HTML tags left unescaped, for example:
<b>some text</b>
(I cannot change this behaviour.)
The app that then consumes this output needs all HTML tags escaped, like this:
&lt;b&gt;some text&lt;/b&gt;
I use the XSLT below to escape the tags but, not surprisingly, it does not work for nested HTML tags, for example where there is
<u><b>A string of html</b></u>
Upon XSLT transform I get
&lt;u&gt;A string of html&lt;/u&gt;
where nested <b> and </b> tags get removed altogether.
I am looking to achieve
&lt;u&gt;&lt;b&gt;A string of html&lt;/b&gt;&lt;/u&gt;
I am sure there's an easy answer to this by adjusting the value-of select or the template, but I have tried and failed dismally.
Any help would be much appreciated!
Sample doc with embedded html tags
<?xml version="1.0" encoding="UTF-8"?>
<Main>
<Text><u><b>A string of html</b></u></Text>
</Main>
This is the XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="no" encoding="UTF-8"/>
<xsl:strip-space elements="*" />
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Text/*">
<xsl:value-of select="concat('&lt;',name(),'&gt;',.,'&lt;/',name(),'&gt;')" />
</xsl:template>
</xsl:stylesheet>
Which produces
<?xml version="1.0" encoding="UTF-8"?>
<Main>
<Text>&lt;u&gt;A string of html&lt;/u&gt;</Text>
</Main>
The inner bold tags have been dropped as you can see.
Can anyone help with adjusting the xslt?
Thank you :-)
Try changing your current Text/* template to this:
<xsl:template match="Text//*">
<xsl:value-of select="concat('&lt;',name(),'&gt;')" />
<xsl:apply-templates />
<xsl:value-of select="concat('&lt;/',name(),'&gt;')" />
</xsl:template>
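The Text//* pattern matches any descendant element of the Text element, not just its immediate children. In your original template, the "." inside concat() is converted to the element's string value, which is plain text with all nested markup dropped; that is why the inner tags disappeared. Here you output the opening and closing tags separately, and in between them apply-templates recursively processes the child nodes, so the same template handles the 'nested' elements.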
When applied to your sample XML, the following should be output
<Main>
<Text>&lt;u&gt;&lt;b&gt;A string of html&lt;/b&gt;&lt;/u&gt;</Text>
</Main>
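For reference, here is a sketch of what the complete stylesheet might look like with that template swapped in. Everything apart from the replaced template is taken unchanged from the question; the comments are added for illustration. Note that this sketch assumes the embedded HTML elements carry no attributes: the default apply-templates only visits child nodes, so an attribute such as href would not be written out.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="no" encoding="UTF-8"/>
  <xsl:strip-space elements="*" />

  <!-- Identity template: copies every node that has no more specific match -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element at any depth below Text is written out as escaped text -->
  <xsl:template match="Text//*">
    <!-- Opening tag as text; the serializer emits it as an escaped &lt;name&gt; -->
    <xsl:value-of select="concat('&lt;',name(),'&gt;')" />
    <!-- Recurse into child nodes so nested elements hit this template too -->
    <xsl:apply-templates />
    <!-- Closing tag as text -->
    <xsl:value-of select="concat('&lt;/',name(),'&gt;')" />
  </xsl:template>
</xsl:stylesheet>
```

The Text//* template wins over the identity template for those elements because its pattern has a higher default priority, so no explicit priority attribute should be needed.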