
Replace special characters in XSLT

Tags: string, xslt

I want to remove all non-alphabetic characters from a string in XSLT. For example:

<Name>O'Niel</Name> = <Name>ONiel</Name>
<Name>St Peter</Name> = <Name>StPeter</Name>
<Name>A.David</Name> = <Name>ADavid</Name>

Can we use regular expressions in XSLT to do this? What is the right way to implement it?

EDIT: This needs to be done in XSLT 1.0.

Amzath asked Feb 22 '11 21:02


4 Answers

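I just created a function based on the code in this example. Note that xsl:function requires XSLT 2.0, and the lancet prefix must be bound to a namespace in the stylesheet.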

    <xsl:function name="lancet:stripSpecialChars">
    <xsl:param name="string" />
    <xsl:variable name="AllowedSymbols" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789()*%$#@!~&lt;&gt;,.?[]=- +   /\ '"/>
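    <!-- Passing '' as the third argument deletes every disallowed character;
         a single space here would map only the first disallowed character
         to a space and delete the rest. -->
    <xsl:value-of select="
        translate(
            $string,
            translate($string, $AllowedSymbols, ''),
            '')
        "/>
    </xsl:function>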

and an example of the usage would be as follows:

<xsl:value-of select="lancet:stripSpecialChars($string)"/>
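For context, here is a minimal sketch of how such a function might sit inside a complete stylesheet. It assumes an XSLT 2.0 processor; the namespace URI http://example.com/lancet is a placeholder invented for illustration, and the whitelist is shortened to letters and digits to keep the example brief.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:lancet="http://example.com/lancet"
    exclude-result-prefixes="lancet">

    <!-- Same double-translate idea, with a shorter whitelist for brevity.
         The namespace URI bound to "lancet" above is a hypothetical placeholder. -->
    <xsl:function name="lancet:stripSpecialChars">
        <xsl:param name="string"/>
        <xsl:variable name="AllowedSymbols"
            select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'"/>
        <!-- Inner translate collects the disallowed characters;
             outer translate deletes them from the original string -->
        <xsl:value-of select="translate($string, translate($string, $AllowedSymbols, ''), '')"/>
    </xsl:function>

    <!-- Call the function on each Name element -->
    <xsl:template match="Name">
        <Name>
            <xsl:value-of select="lancet:stripSpecialChars(.)"/>
        </Name>
    </xsl:template>
</xsl:stylesheet>
```

The function's prefix must be declared on xsl:stylesheet because XSLT 2.0 does not allow stylesheet functions in no namespace.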
James Christopher Fourie answered Sep 21 '22 10:09


There is a pure XSLT 1.0 way to do this, using a double translate():

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:variable name="vAllowedSymbols"
        select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'"/>
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="text()">
        <xsl:value-of select="
            translate(
                .,
                translate(., $vAllowedSymbols, ''),
                ''
                )
            "/>
    </xsl:template>
</xsl:stylesheet>

Result against this sample:

<t>
    <Name>O'Niel</Name>
    <Name>St Peter</Name>
    <Name>A.David</Name>
</t>

Will be:

<t>
    <Name>ONiel</Name>
    <Name>StPeter</Name>
    <Name>ADavid</Name>
</t>
Flack answered Sep 18 '22 10:09


The quickest way is <xsl:value-of select="translate(Name,translate(Name,'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',''),'')" />

The inner translate() removes the letters (the characters you want to keep), so its result contains only the unwanted characters. The outer translate() then removes exactly those unwanted characters from the original string.
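To make the two steps visible, here is a small XSLT 1.0 sketch that prints the intermediate result of the inner translate() next to the final value. It assumes the input contains Name elements like those in the question; the variable names are only illustrative.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:variable name="vLetters"
        select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
    <xsl:template match="/">
        <xsl:for-each select="//Name">
            <!-- Step 1: delete every letter, leaving only the unwanted characters -->
            <xsl:variable name="vUnwanted" select="translate(., $vLetters, '')"/>
            <!-- Step 2: delete those unwanted characters from the original string -->
            <xsl:variable name="vClean" select="translate(., $vUnwanted, '')"/>
            <!-- Print "[unwanted] -> cleaned", one Name per line -->
            <xsl:value-of select="concat('[', $vUnwanted, '] -> ', $vClean, '&#10;')"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

For the three sample names this prints `['] -> ONiel`, `[ ] -> StPeter` and `[.] -> ADavid`, each on its own line.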

nagarayan answered Sep 21 '22 10:09


Here's an XSLT 2.0 option:

EDIT: Sorry...the 1.0 requirement was added after I started on my answer.

XML

<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <Name>O'Niel</Name>
  <Name>St Peter</Name>
  <Name>A.David</Name>
</doc>

XSLT 2.0

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="*|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:value-of select="replace(.,'[^a-zA-Z]','')"/>
  </xsl:template>

</xsl:stylesheet>

Output

<?xml version="1.0" encoding="UTF-8"?>
<doc>
   <Name>ONiel</Name>
   <Name>StPeter</Name>
   <Name>ADavid</Name>
</doc>

Here are a couple more ways of using replace()...

Using the "i" (case-insensitive) flag:

replace(.,'[^A-Z]','','i')

Using category escapes (\P{L} matches anything that is not a Unicode letter, so letters outside a-z and A-Z are kept too):

replace(.,'\P{L}','')
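The difference between the ASCII character class and the category escape only shows up with non-ASCII letters. Here is a minimal XSLT 2.0 sketch that applies both to a made-up value, "José-Luis", which is not from the question and is written with a character reference so the é is unambiguous.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <!-- Hypothetical sample value: "José-Luis" with a precomposed é (U+00E9) -->
        <xsl:variable name="vName" select="'Jos&#233;-Luis'"/>
        <!-- ASCII-only class: drops the hyphen and the é -->
        <xsl:value-of select="replace($vName, '[^a-zA-Z]', '')"/>
        <xsl:text>&#10;</xsl:text>
        <!-- Unicode letter category: drops the hyphen, keeps the é -->
        <xsl:value-of select="replace($vName, '\P{L}', '')"/>
    </xsl:template>
</xsl:stylesheet>
```

The first line of output is JosLuis and the second is JoséLuis, so which pattern to use depends on whether accented letters count as "alphabets" for your data.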
Daniel Haley answered Sep 21 '22 10:09