 

XSLT, Extract sub-string from xml using regex

Tags:

regex

xslt

I am trying to apply an XSLT stylesheet to an SVN log in order to extract bug numbers from the commit messages. I apply the regex shown in the stylesheet to msg but get nothing back. What am I missing in the XSLT? Thank you in advance. Below is the XML that I get from SVN, followed by my stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<log>
	<logentry revision="265">
		<author>dre</author>
		<date>2015-04-13T02:35:25.246150Z</date>
		<msg>modified code</msg>
	</logentry>
	<logentry revision="73283">
		<author>john</author>
		<date>2015-04-13T14:10:20.987159Z</date>
		<msg>fixed bug DESK-1868</msg>
	</logentry>
	<logentry revision="73290">
		<author>ron</author>
		<date>2015-04-13T14:24:57.475711Z</date>
		<msg>WEBAPP-1868 Fix for pallete list and settings dialog Selected Tab Index</msg>
	</logentry>
</log>

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <html>
  <body>
  <h2>SVN Issues</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th style="text-align:left">ver</th>
        <th style="text-align:left">author</th>
        <th style="text-align:left">date</th>
        <th style="text-align:left">ticket</th>
      </tr>
      <xsl:for-each select="log/logentry">
      <tr>
        <td><xsl:value-of select="@revision"/></td>
        <td><xsl:value-of select="author"/></td>
        <td><xsl:value-of select="date"/></td>
        <td>
            
                <xsl:variable name="messageValue" select="msg"/>
                <xsl:analyze-string select="$messageValue" 
                  regex="(DESK|TRS|PEK|WEBAPP)-\d{4}$">
                      <xsl:matching-substring>
                         <bug><xsl:value-of select="regex-group(1)"/></bug>
                      </xsl:matching-substring>
                </xsl:analyze-string>
        </td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>
Pet Mor asked Sep 02 '25 13:09


1 Answer

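  1. The regex attribute is an attribute value template, so {4} is evaluated as an XPath expression instead of being passed to the regex engine as a quantifier. See http://www.w3.org/TR/xslt20/#analyze-string: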

    Note: Because the regex attribute is an attribute value template, curly brackets within the regular expression must be doubled. For example, to match a sequence of one to five characters, write regex=".{{1,5}}". For regular expressions containing many curly brackets it may be more convenient to use a notation such as regex="{'[0-9]{1,5}[a-z]{3}[0-9]{1,2}'}", or to use a variable.

  2. Do not anchor the expression with $ at the end. Without the multiline flag, $ matches only at the end of the input string, so the regex matches only when the message ends with an issue ID.
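The spec note quoted in point 1 also mentions keeping the pattern in a variable, which avoids the doubling altogether. A minimal sketch of that option, with the trailing $ from point 2 already dropped; the variable name bugPattern and the plain-text output are assumptions made for illustration, not part of the question:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical variable name. select is an ordinary XPath expression,
       not an attribute value template, so {4} keeps its single braces. -->
  <xsl:variable name="bugPattern" select="'(DESK|TRS|PEK|WEBAPP)-\d{4}'"/>

  <xsl:template match="/">
    <xsl:for-each select="log/logentry">
      <!-- The attribute value template evaluates $bugPattern and uses its value as the regex -->
      <xsl:analyze-string select="msg" regex="{$bugPattern}">
        <xsl:matching-substring>
          <!-- Inside matching-substring the context item is the matched text -->
          <xsl:value-of select="."/>
          <xsl:text>&#10;</xsl:text>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Because the context item inside xsl:matching-substring is the whole match, this variant needs no extra capture group to get the full issue ID.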

Use this regex, with doubled curly brackets and an outer capture group, so that regex-group(1) returns the entire bug number instead of only the project key:

regex="((DESK|TRS|PEK|WEBAPP)-\d{{4}})"
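For context, here is a minimal sketch of how that attribute might sit in a complete stylesheet. It emits plain text instead of the question's HTML table purely to keep the example short; that output choice is an assumption for illustration, while the element names and the regex come from the question and the answer above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="log/logentry">
      <xsl:value-of select="@revision"/>
      <xsl:text>: </xsl:text>
      <!-- Braces are doubled because regex is an attribute value template;
           there is no trailing $ so the ID may appear anywhere in the message -->
      <xsl:analyze-string select="msg" regex="((DESK|TRS|PEK|WEBAPP)-\d{{4}})">
        <xsl:matching-substring>
          <!-- Group 1 is the full ID; group 2 would be the project key alone -->
          <xsl:value-of select="regex-group(1)"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the SVN log from the question with an XSLT 2.0 processor, this should report DESK-1868 for revision 73283 and WEBAPP-1868 for revision 73290, and nothing for revision 265. An XSLT 1.0 processor will not work, since xsl:analyze-string was introduced in XSLT 2.0.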
Mads Hansen answered Sep 05 '25 06:09 (2 revisions)