
XSLT 2.0 Solution for Merging Sibling Elements With Same Name and Attribute Value

Tags: xslt, xslt-2.0

I am looking for a solution that will turn

<p>
<hi rend="bold">aa</hi>
<hi rend="bold">bb</hi>
<hi rend="bold">cc</hi>
Perhaps some text.
<hi rend="italic">dd</hi>
<hi rend="italic">ee</hi>
Some more text.
<hi rend="italic">ff</hi>
<hi rend="italic">gg</hi>
Foo.
</p>

into

<p>
<hi rend="bold">aabbcc</hi>
Perhaps some text.
<hi rend="italic">ddee</hi>
Some more text.
<hi rend="italic">ffgg</hi>
Foo. 
</p>

but my solution should not hardcode the element names or the attribute values (italic, bold). The XSLT should really concatenate ALL sibling elements that have the same name and the same attribute value. Everything else should be left untouched.

I have looked at the solutions that already exist out there but none of them seemed to satisfy all of my requirements.

If anybody has a handy XSLT stylesheet for this, I'd be much obliged.

Tench asked Feb 20 '23 00:02


2 Answers

This XSLT 2.0 stylesheet merges adjacent elements that share the same name and the same rend attribute value.

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes" />
<xsl:strip-space elements="*" />  

<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()" />
  </xsl:copy>
</xsl:template>

<xsl:template match="*[*/@rend]">
  <xsl:copy>
    <xsl:apply-templates select="@*" />
    <xsl:for-each-group select="node()" group-adjacent="
       if (self::*/@rend) then
           concat( namespace-uri(), '|', local-name(), '|', @rend)
         else
           ''">
      <xsl:choose>
        <xsl:when test="current-grouping-key()" >
          <xsl:for-each select="current-group()[1]">
            <xsl:copy>
              <xsl:apply-templates select="@* | current-group()/node()" />
            </xsl:copy>
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
         <xsl:apply-templates select="current-group()" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

The advantages of this solution over Martin's are:

  • This merges over all parent elements, not just p elements.
  • Faster. Merging is accomplished with a single xsl:for-each-group instead of two nested xsl:for-each-group instructions.
  • The non-rend attributes of the head merge-able element are copied to the output.

Note also:

  • No explicit test for whitespace-only text nodes is needed when determining which elements count as "adjacent": the xsl:strip-space instruction removes those nodes before grouping. Thus the xsl:for-each-group instruction is fairly simple and readable.
  • As an alternative to the group-adjacent attribute value, you could use instead ...

    <xsl:for-each-group select="node()" group-adjacent="
       string-join(for $x in self::*/@rend return
         concat( namespace-uri(), '|', local-name(), '|', @rend),'')">
    

    Use whichever form you personally find more readable.
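
The stylesheet above still names the rend attribute explicitly, while the question asks for nothing to be hardcoded. One possible generalization, offered as an untested sketch rather than a definitive answer, is to build the grouping key from the element name plus all of its attributes. The function name f:merge-key and the namespace urn:example:merge are made up for this illustration. The sketch assumes that elements without any attributes should not be merged and that attribute values do not contain the '|' or '=' separators.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:f="urn:example:merge"
  exclude-result-prefixes="xs f">
<xsl:output omit-xml-declaration="yes" />
<xsl:strip-space elements="*" />

<!-- Hypothetical helper: key = namespace, local name and every attribute
     (sorted by name so attribute order does not matter).
     Returns '' for non-elements and for elements without attributes. -->
<xsl:function name="f:merge-key" as="xs:string">
  <xsl:param name="n" as="node()" />
  <xsl:variable name="sorted" as="attribute()*">
    <xsl:perform-sort select="$n/@*">
      <xsl:sort select="name()" />
    </xsl:perform-sort>
  </xsl:variable>
  <xsl:sequence select="
     if (exists($sorted)) then
       string-join((string(namespace-uri($n)), local-name($n),
                    for $a in $sorted return concat(name($a), '=', $a)), '|')
     else
       ''" />
</xsl:function>

<!-- Identity template: copy everything that is not merged -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()" />
  </xsl:copy>
</xsl:template>

<!-- Any parent that has at least one child element carrying attributes -->
<xsl:template match="*[*/@*]">
  <xsl:copy>
    <xsl:apply-templates select="@*" />
    <xsl:for-each-group select="node()" group-adjacent="f:merge-key(.)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <!-- Copy the first element of the run and pull in the content of the whole run -->
          <xsl:for-each select="current-group()[1]">
            <xsl:copy>
              <xsl:apply-templates select="@* | current-group()/node()" />
            </xsl:copy>
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

Note that this key compares every attribute, so two adjacent elements that differ in any attribute (for example an id) are kept separate. Whether that is desirable depends on the data.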

Sean B. Durkin answered Mar 22 '23 23:03


Is the name of that attribute (e.g. rend) known? In that case I think you want

<xsl:template match="p">
  <xsl:copy>
    <xsl:for-each-group select="*" group-adjacent="concat(node-name(.), '|', @rend)">
      <xsl:element name="{name()}" namespace="{namespace-uri()}">
        <xsl:copy-of select="@rend"/>
        <xsl:apply-templates select="current-group()/node()"/>
      </xsl:element>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

[edit] If text nodes with content can occur between the elements, as shown in the edit of your input, then you need to nest two groupings, as in this sample:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs">

<xsl:template match="p">
  <xsl:copy>
    <xsl:for-each-group select="node() except text()[not(normalize-space())]" group-adjacent="boolean(self::*)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <xsl:for-each-group select="current-group()" group-by="concat(node-name(.), '|', @rend)">
            <xsl:element name="{name()}" namespace="{namespace-uri()}">
               <xsl:copy-of select="@rend"/>
               <xsl:apply-templates select="current-group()/node()"/>
            </xsl:element>
          </xsl:for-each-group>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
     </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
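
One behavior of the nested version is worth spelling out. Within a run of consecutive elements, group-by collects every element with the same key, adjacent or not, so a run such as bold, italic, bold would come out as one merged bold element followed by the italic one. If only consecutive matches should merge, a possible variant (a sketch under that assumption, not the answer as originally posted) switches the inner grouping to group-adjacent:

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0">

<xsl:template match="p">
  <xsl:copy>
    <!-- Outer grouping: separate runs of elements from runs of non-whitespace text -->
    <xsl:for-each-group select="node() except text()[not(normalize-space())]" group-adjacent="boolean(self::*)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <!-- Inner grouping: group-adjacent instead of group-by, so only
               consecutive elements with the same name and rend value merge -->
          <xsl:for-each-group select="current-group()" group-adjacent="concat(node-name(.), '|', @rend)">
            <xsl:element name="{name()}" namespace="{namespace-uri()}">
              <xsl:copy-of select="@rend"/>
              <xsl:apply-templates select="current-group()/node()"/>
            </xsl:element>
          </xsl:for-each-group>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

For the sample input in the question both forms should behave the same, because each run of elements there contains a single name and rend combination.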
Martin Honnen answered Mar 22 '23 23:03