
Remove characters using xsl

Tags: xml, xslt, xslt-1.0

I need to remove the following characters from a string value using XSLT 1.0:

*, /, \, #, %, !, @, $, (, ), &

I have come up with the following:

translate(translate(translate(string(//xpath/@value),'.',''),'/',''),',','')

In the above approach, I would have to duplicate the same code many times (one time per character).

How can I achieve the same goal without duplicating the code?

Thanks :-)

dinesh028 asked Oct 26 '12 14:10



1 Answer

The translate() function accepts whole strings, not just single characters, as its second and third arguments.

translate(., $string1, '')

produces the string value of the context (current) node with every occurrence of any character contained in $string1 deleted.

Therefore you can use:

translate(expressionSelectingNode, "*/\#%!@$()&", "")

to delete any of the characters contained in the second argument.

Of course, if the translate() function is used within an XSLT stylesheet (or, generally, within any XML document), special characters such as < and & must be escaped as &lt; and &amp; respectively.
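To show the escaping in context, here is a minimal sketch of a complete XSLT 1.0 stylesheet. The input shape (a document element named item with a value attribute) and the variable name unwanted are assumptions made for illustration; they are not from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Characters to delete. The ampersand must be written as &amp;
       because this string lives inside an XML attribute. -->
  <xsl:variable name="unwanted" select="'*/\#%!@$()&amp;'"/>

  <!-- Hypothetical input: <item value="..."/> as the document element. -->
  <xsl:template match="/">
    <!-- An empty third argument deletes every character listed in $unwanted. -->
    <xsl:value-of select="translate(/item/@value, $unwanted, '')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to <item value="50%(off)!"/>, this should output 50off. Keeping the character list in a variable means it is written, and escaped, only once even if several expressions need it.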

This technique is powerful enough to remove a set of characters that is not known in advance:

Suppose you want to remove every non-numeric character from a string. We don't know in advance which characters will be present in the string, so we cannot simply enumerate them in the second argument of translate(). However, we can still delete all of these unknown characters like this:

translate(., translate(., '0123456789', ''), '')

The inner translate() produces the string sans any digits.

The outer translate() deletes all of these non-digit characters (found by the inner translate()) from the original string, so only the digit characters remain.
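The two steps can be spelled out with intermediate variables, which may be easier to read than the nested call. This is only a sketch: the input shape (an item element with a value attribute) and the variable names digits, raw and nonDigits are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- The characters we want to keep. -->
  <xsl:variable name="digits" select="'0123456789'"/>

  <!-- Hypothetical input: <item value="..."/> as the document element. -->
  <xsl:template match="/">
    <xsl:variable name="raw" select="string(/item/@value)"/>
    <!-- Inner step: delete the digits, leaving only the characters to remove. -->
    <xsl:variable name="nonDigits" select="translate($raw, $digits, '')"/>
    <!-- Outer step: delete those leftover characters from the original string. -->
    <xsl:value-of select="translate($raw, $nonDigits, '')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to <item value="Tel: (555) 010-9999"/>, this should output 5550109999. Changing the digits variable to another set of characters keeps only that set instead.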

Dimitre Novatchev answered Oct 14 '22 11:10