I want to remove all non-alphabetic characters from a string in XSLT. For example:
<Name>O'Niel</Name> should become <Name>ONiel</Name>
<Name>St Peter</Name> should become <Name>StPeter</Name>
<Name>A.David</Name> should become <Name>ADavid</Name>
Can we use regular expressions in XSLT to do this? What is the right way to implement it?
EDIT: This needs to be done in XSLT 1.0.
I just created a function based on the code in this example...
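<!-- Note: xsl:function requires XSLT 2.0, and the lancet prefix must be bound to a namespace in the stylesheet. -->
<xsl:function name="lancet:stripSpecialChars">
  <xsl:param name="string" />
  <!-- Characters to keep. A literal '<' is not allowed inside an attribute value, so it is escaped here. -->
  <xsl:variable name="AllowedSymbols" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789()*%$#@!~&lt;&gt;,.?[]=- + /\ '"/>
  <!-- The inner translate() collects the disallowed characters; the outer one deletes them.
       The third argument is the empty string so that every disallowed character is removed. -->
  <xsl:value-of select="
    translate(
      $string,
      translate($string, $AllowedSymbols, ''),
      '')
  "/>
</xsl:function>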
and an example of the usage would be as follows:
<xsl:value-of select="lancet:stripSpecialChars($string)"/>
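Since xsl:function is only available from XSLT 2.0 onwards, the same idea can be written for XSLT 1.0 as a named template. The following is a minimal sketch rather than the original author's code: the template name stripNonLetters and the variable vLetters are hypothetical, and the whitelist is reduced to ASCII letters to match the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Hypothetical whitelist: only ASCII letters are kept. -->
  <xsl:variable name="vLetters"
    select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'"/>

  <!-- Hypothetical named template playing the role of the function above. -->
  <xsl:template name="stripNonLetters">
    <xsl:param name="string"/>
    <!-- Inner translate(): the characters that are NOT letters.
         Outer translate(): delete exactly those characters. -->
    <xsl:value-of select="translate($string, translate($string, $vLetters, ''), '')"/>
  </xsl:template>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rebuild each Name element with its cleaned-up string value. -->
  <xsl:template match="Name">
    <Name>
      <xsl:call-template name="stripNonLetters">
        <xsl:with-param name="string" select="."/>
      </xsl:call-template>
    </Name>
  </xsl:template>
</xsl:stylesheet>
```

A named template is called with xsl:call-template and xsl:with-param instead of being used inside an XPath expression, which is the main practical difference from the 2.0 function.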
There is a pure XSLT way to do this.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
  <xsl:variable name="vAllowedSymbols"
    select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'"/>
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:value-of select="
      translate(
        .,
        translate(., $vAllowedSymbols, ''),
        ''
      )
    "/>
  </xsl:template>
</xsl:stylesheet>
When applied to this sample:
<t>
  <Name>O'Niel</Name>
  <Name>St Peter</Name>
  <Name>A.David</Name>
</t>
the result will be:
<t>
  <Name>ONiel</Name>
  <Name>StPeter</Name>
  <Name>ADavid</Name>
</t>
The quickest way is: <xsl:value-of select="translate(Name,translate(Name,'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',''),'')" />
The inner translate() removes the letters (the characters we want to keep), so its result is exactly the set of unwanted characters in the string. The outer translate() then removes those unwanted characters from the original string.
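The two steps can be made visible by storing the inner result in a variable. This is only an illustrative sketch for XSLT 1.0: the variable names vLetters and vUnwanted are made up here, and the stylesheet assumes an input that contains Name elements like the ones in the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="vLetters"
    select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <xsl:template match="/">
    <xsl:for-each select="//Name">
      <!-- Step 1 (inner translate): delete the letters, which leaves only
           the unwanted characters. For O'Niel this is a single apostrophe,
           for St Peter a single space, for A.David a single dot. -->
      <xsl:variable name="vUnwanted" select="translate(., $vLetters, '')"/>
      <!-- Step 2 (outer translate): delete every character found in step 1
           from the original string, e.g. O'Niel becomes ONiel. -->
      <xsl:value-of select="translate(., $vUnwanted, '')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the three names in the question this prints ONiel, StPeter and ADavid, one per line.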
Here's a 2.0 option:
EDIT: Sorry...the 1.0 requirement was added after I started on my answer.
XML
<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <Name>O'Niel</Name>
  <Name>St Peter</Name>
  <Name>A.David</Name>
</doc>
XSLT 2.0
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="*|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:value-of select="replace(.,'[^a-zA-Z]','')"/>
  </xsl:template>
</xsl:stylesheet>
Output
<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <Name>ONiel</Name>
  <Name>StPeter</Name>
  <Name>ADavid</Name>
</doc>
Here are a couple more ways of using replace()
...
Using "i" (case-insensitive mode) flag:
replace(.,'[^A-Z]','','i')
Using category escapes:
replace(.,'\P{L}','')
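The category escape behaves differently from the ASCII character class once the input contains letters outside A-Z. The sketch below is a hedged illustration, assuming an XSLT 2.0 processor; the test string and the variable name vName are invented for the example, and the stylesheet ignores the content of whatever input document it is run against.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/">
    <!-- Hypothetical test value: "José O'Niel" (&#233; is e with acute accent;
         the doubled apostrophe is the XPath 2.0 escape for a single one). -->
    <xsl:variable name="vName" select="'Jos&#233; O''Niel'"/>

    <!-- ASCII-only class: the accented letter is removed as well. -->
    <xsl:value-of select="replace($vName, '[^a-zA-Z]', '')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- \P{L} matches anything that is not a Unicode letter,
         so the accented letter is kept. -->
    <xsl:value-of select="replace($vName, '\P{L}', '')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The first line of output is JosONiel and the second is JoséONiel, so \P{L} is the safer choice when names may contain accented or non-Latin letters.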