
pattern match in xslt

Tags: xml, xslt, xpath

I have the following XML:

<xml>
    <para>
       <number>1</number>
       <text> Paragraph 1(<italic>A</italic>) is this para.</text>
    </para>
</xml>

I want to match the text element when it contains a pattern that starts with the word Paragraph, followed by a space, one or more digits, "(", an italic node and digit, and a closing ")". The match should then be wrapped in an anchor tag, so the output for the XML above should be:

 <xml>
    <para>
       <number>1</number>
       <text> <a href="Paragraph1(A)">Paragraph 1(<italic>A</italic>)</a> is this para.</text>
    </para>
</xml>

That is, Paragraph 1(<italic>A</italic>) should be wrapped in an a tag whose href value is the matched text with the spaces and the italic markup removed.

Any help or hints on how to handle this with regex would be appreciated.

atif asked Feb 14 '26 03:02


2 Answers

This XSLT 2.0 stylesheet builds the desired href and wraps the content of the text element in an anchor:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output omit-xml-declaration="no" indent="yes"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Only our text element requires special handling here. \d+ requires at least one digit. -->
    <xsl:template match="text[matches(.,'Paragraph\s+\d+')]">
        <xsl:copy>
            <xsl:variable name="textElement" select="."/>
            <xsl:analyze-string select="." regex="(Paragraph\s+\d+)(\(.*\))">
                <xsl:matching-substring>
                    <a href="{concat(replace(regex-group(1),'\s',''),regex-group(2))}">
                        <xsl:apply-templates select="$textElement/node()"/>
                    </a>
                </xsl:matching-substring>
            </xsl:analyze-string>       
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
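
In the stylesheet above, xsl:apply-templates runs over every child of text inside the anchor, and there is no xsl:non-matching-substring, so for the sample input the words after the reference also end up inside the a element. Here is a minimal sketch of one way to wrap only the reference. It assumes the reference is always split across a text node ending in "Paragraph N(", an italic element, and a text node starting with ")". The f:starts-ref function and its urn:example:local namespace are hypothetical names introduced for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:f="urn:example:local"
                exclude-result-prefixes="xs f" version="2.0">

    <!-- Identity template: copy everything not handled below -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Hypothetical helper: true if $t is a text node ending in "Paragraph N("
         that is followed by an italic element and then by text starting with ")" -->
    <xsl:function name="f:starts-ref" as="xs:boolean">
        <xsl:param name="t" as="node()?"/>
        <xsl:sequence select="exists($t[self::text()])
            and matches(string($t), 'Paragraph\s+\d+\($')
            and exists($t/following-sibling::node()[1][self::italic])
            and starts-with(string($t/following-sibling::node()[2][self::text()]), ')')"/>
    </xsl:function>

    <!-- The italic element is re-created inside the anchor, so drop it here -->
    <xsl:template match="text/italic[f:starts-ref(preceding-sibling::node()[1])]"/>

    <xsl:template match="text/text()">
        <xsl:variable name="italic" select="following-sibling::node()[1]"/>
        <!-- If this node closes a reference, its leading ")" is already in the anchor -->
        <xsl:variable name="s" select="if (f:starts-ref(preceding-sibling::node()[2]))
                                       then substring(., 2) else string(.)"/>
        <xsl:choose>
            <xsl:when test="f:starts-ref(.)">
                <xsl:analyze-string select="$s" regex="Paragraph\s+(\d+)\($">
                    <xsl:matching-substring>
                        <!-- href: matched text without spaces, italic reduced to its text -->
                        <a href="{concat('Paragraph', regex-group(1), '(', $italic, ')')}">
                            <xsl:value-of select="."/>
                            <xsl:copy-of select="$italic"/>
                            <xsl:text>)</xsl:text>
                        </a>
                    </xsl:matching-substring>
                    <xsl:non-matching-substring>
                        <xsl:value-of select="."/>
                    </xsl:non-matching-substring>
                </xsl:analyze-string>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$s"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

</xsl:stylesheet>
```

The sketch works on the child nodes of text rather than on the element's string value, because the string value no longer shows where the italic element was. Text before and after the reference is passed through unchanged, so only Paragraph 1(<italic>A</italic>) lands inside the anchor.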
Mads Hansen answered Feb 16 '26 21:02


This can give you an idea on how you could solve it:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:template match="/">
    <xsl:apply-templates/>
</xsl:template>

<!-- Only our text element requires special handling here -->
<xsl:template match="text">
    <xsl:copy>
        <xsl:choose>
            <!-- \d+ requires at least one digit after "Paragraph " -->
            <xsl:when test="matches(.,'Paragraph\s+\d+')">
                <!-- Save original text value here -->
                <xsl:variable name="temp" select="."/>
                <!-- Save the value of <italic>x</italic> child element -->
                <xsl:variable name="italic_val" select="italic/text()"/>
                <xsl:analyze-string select="." regex="(Paragraph\s+\d+)">
                    <xsl:matching-substring>
                        <xsl:element name="a">
                            <xsl:attribute name="href">
                                <xsl:value-of select="concat(replace(regex-group(1),'\s',''),'(',$italic_val,')')"/>
                            </xsl:attribute>
                            <xsl:value-of select="$temp"/>
                        </xsl:element>
                    </xsl:matching-substring>
                </xsl:analyze-string>

            </xsl:when>
            <!-- No match: keep the element's original content -->
            <xsl:otherwise>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:copy>
</xsl:template>

<!-- Identity template: copy everything else unchanged -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

It basically uses the XSLT identity template to copy the original document and defines a template to handle the <text> element. That template analyzes the element's text content against a regex for the word Paragraph followed by whitespace and digits. When it finds a match, it generates the anchor structure, using temporary variables to hold the original text and the italic value.

Here is my output file:

<xml>
  <para>
    <number>1</number>
    <text><a href="Paragraph1(A)"> Paragraph 1(A) is this para.</a></text>
  </para>
</xml>

I'm still missing the italic markup: the output contains Paragraph 1(A) instead of Paragraph 1(<italic>A</italic>), and the anchor wraps the whole text, but that's just some tweaking...

Take a look at this link; it may help you understand regex in XSLT.

Note that it uses XSLT 2.0.

Adolfo Perez answered Feb 16 '26 20:02


