
XSLT: Finding last occurrence in a string

Given a form number like:

ABC_12345_Q-10

I want to end up with:

ABC12345

So I need to find the position of the last underscore (the second one in this example).

Note that there is no standard pattern or length to any of the "sections" between the underscores (so I cannot use substring to simply eliminate the last section).

XPath 2.0 solutions are okay.

johkar asked Jun 29 '10 14:06



3 Answers

@Pavel_Minaev has provided XPath 1.0 and XPath 2.0 solutions that work if it is known in advance that the number of underscores is 2.

Here are solutions for the more difficult problem, where the number of underscores isn't statically known (it may be any number):

XPath 2.0:

translate(substring($s,
                    1, 
                    index-of(string-to-codepoints($s), 
                             string-to-codepoints('_')
                             )[last()] -1
                   ),
          '_',
          ''
         )

XSLT 1.0:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 >
 <xsl:output method="text"/>

  <xsl:variable name="s" select="'ABC_12345_Q-10'"/>
  <xsl:template match="/">
    <xsl:call-template name="stripLast">
     <xsl:with-param name="pText" select="$s"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="stripLast">
    <xsl:param name="pText"/>
    <xsl:param name="pDelim" select="'_'"/>

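     <!-- Emit the text before the first delimiter, then recurse on the
          remainder. Once no delimiter is left, nothing more is output,
          so the final section is dropped. -->
     <xsl:if test="contains($pText, $pDelim)">
       <xsl:value-of select="substring-before($pText, $pDelim)"/>
       <xsl:call-template name="stripLast">
         <xsl:with-param name="pText" select=
          "substring-after($pText, $pDelim)"/>
         <xsl:with-param name="pDelim" select="$pDelim"/>
       </xsl:call-template>
     </xsl:if>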
   </xsl:template>
</xsl:stylesheet>

When this transformation is applied to any XML document (the input is not used), the desired result is produced:

ABC12345

Dimitre Novatchev answered Oct 24 '22 11:10



Easier solution in XSLT 2.0:

codepoints-to-string(reverse(string-to-codepoints(
    substring-before(
        codepoints-to-string(reverse(string-to-codepoints($s))), '_'))))

With 'substring-before', as shown, you get everything after the last occurrence of your delimiter (the underscore), here Q-10. If you use 'substring-after' instead, you get everything before the last occurrence of your delimiter, here ABC_12345.
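
To arrive at the ABC12345 asked for in the question, the 'substring-after' variant can be combined with translate() to drop the underscores that remain. The stylesheet below is only a minimal sketch of that idea and is not part of the original answer. The variable name $rev is introduced here for illustration, and the sketch assumes an XSLT 2.0 processor and that $s contains at least one underscore (without one, the output is empty).

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="s" select="'ABC_12345_Q-10'"/>

  <!-- $s reversed, computed once (hypothetical helper variable) -->
  <xsl:variable name="rev"
    select="codepoints-to-string(reverse(string-to-codepoints($s)))"/>

  <xsl:template match="/">
    <!-- substring-after on the reversed string keeps what precedes the
         last '_' of the original; reversing again restores the order,
         and translate() removes the remaining underscores -->
    <xsl:value-of select="translate(
        codepoints-to-string(reverse(string-to-codepoints(
            substring-after($rev, '_')))),
        '_', '')"/>
  </xsl:template>
</xsl:stylesheet>
```

As with the XSLT 1.0 stylesheet above, the source document is not used, so this can be run against any XML input.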

omoroca answered Oct 24 '22 12:10



XPath 1.0, assuming exactly two underscores:

concat(
    substring-before($s, '_'),
    substring-before(substring-after($s, '_'), '_')
)

Alternatively, in XPath 2.0:

string-join(tokenize($s, '_')[position() <= 2], '')
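
If the number of underscores is not known in advance, the tokenize approach can probably be adapted by keeping every token except the last one instead of the first two. The following is a minimal sketch under that assumption, written for an XSLT 2.0 processor. The predicate uses lt rather than < so that it can sit inside an XML attribute without escaping.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="s" select="'ABC_12345_Q-10'"/>

  <xsl:template match="/">
    <!-- Split on '_', drop the final token, and join the rest with
         no separator -->
    <xsl:value-of
      select="string-join(tokenize($s, '_')[position() lt last()], '')"/>
  </xsl:template>
</xsl:stylesheet>
```

Note that a string with no underscore at all yields a single token, which is then dropped, so the output is empty in that case.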

Pavel Minaev answered Oct 24 '22 10:10
