XPath last occurrence of each element

Tags: xml, xslt, xpath

I have XML like

<root>
    <a>One</a>
    <a>Two</a>
    <b>Three</b>
    <c>Four</c>
    <a>Five</a>
    <b>
        <a>Six</a>
    </b>
</root>

and need to select the last occurrence of any child node name in root. In this case, the desired resulting list would be:

<c>Four</c>
<a>Five</a>
<b>
    <a>Six</a>
</b>

Any help is appreciated!

JJones56 asked Jul 05 '11 09:07


3 Answers

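Both the XPath 2.0 solution and the currently accepted answer are very inefficient (O(N^2)), because every child element rescans its following siblings.

I XSLT 1.0 solution using a key (roughly linear, provided the processor indexes its keys):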

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kElemsByName" match="/*/*"
  use="name()"/>

 <xsl:template match="/">
  <xsl:copy-of select=
    "/*/*[generate-id()
         =
          generate-id(key('kElemsByName', name())[last()])
         ]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<root>
    <a>One</a>
    <a>Two</a>
    <b>Three</b>
    <c>Four</c>
    <a>Five</a>
    <b>
        <a>Six</a>
    </b>
</root>

the wanted, correct result is produced:

<c>Four</c>
<a>Five</a>
<b>
   <a>Six</a>
</b>

Explanation: This is a variant of Muenchian grouping in which the last node of each group, rather than the first, is processed.
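
For readers who want to check the grouping logic outside an XSLT processor, here is a minimal Python sketch of the same "last node per name" idea. It is an illustration only, not part of the stylesheet: the function name last_of_each_name and the use of xml.etree.ElementTree are assumptions made for this example, and the dictionary merely stands in for the kElemsByName key.

```python
import xml.etree.ElementTree as ET

XML = """<root>
    <a>One</a>
    <a>Two</a>
    <b>Three</b>
    <c>Four</c>
    <a>Five</a>
    <b>
        <a>Six</a>
    </b>
</root>"""


def last_of_each_name(root):
    """Return the last child of `root` for each element name, in document order."""
    last_by_name = {}
    # Single pass over the children: a later child with the same name
    # overwrites the earlier one, which mirrors
    # key('kElemsByName', name())[last()] in the stylesheet.
    for position, child in enumerate(root):
        last_by_name[child.tag] = (position, child)
    # Sort by original position to restore document order.
    ordered = sorted(last_by_name.values(), key=lambda pair: pair[0])
    return [child for _, child in ordered]


root = ET.fromstring(XML)
result = last_of_each_name(root)
print([(el.tag, (el.text or "").strip()) for el in result])
# prints [('c', 'Four'), ('a', 'Five'), ('b', '')]
```

The nested a inside b is never considered on its own, because only the direct children of root are iterated, just as /*/* does in the key's match pattern.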

II XPath 2.0 one-liner:

Use:

/*/*[index-of(/*/*/name(), name())[last()]]

Verification using XSLT 2.0 as the XPath 2.0 host:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:sequence select=
    "/*/*[index-of(/*/*/name(), name())[last()]]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the same XML document (provided earlier), the same correct result is produced:

<c>Four</c>
<a>Five</a>
<b>
    <a>Six</a>
</b>
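
To see why the numeric predicate in the one-liner selects the right nodes, the following Python sketch emulates it on the list of child names. The helper index_of is a hypothetical stand-in for the XPath 2.0 index-of() function, and the names list is typed in by hand from the sample document.

```python
def index_of(seq, item):
    """1-based positions of `item` in `seq`, like XPath 2.0 index-of()."""
    return [i for i, value in enumerate(seq, start=1) if value == item]


# What /*/*/name() yields for the sample document.
names = ["a", "a", "b", "c", "a", "b"]

# A numeric predicate [N] keeps a node only when its position() equals N.
# Here N is the last position at which the node's own name occurs.
selected = [
    position
    for position, name in enumerate(names, start=1)
    if position == index_of(names, name)[-1]
]
print(selected)
# prints [4, 5, 6], i.e. <c>Four</c>, <a>Five</a> and the final <b>
```

Each call to index_of scans the whole list, so this emulation does quadratic work in the number of children; whether a given XPath 2.0 processor optimizes the real expression is not something this sketch can show.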

Dimitre Novatchev answered Nov 15 '22 11:11


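If you can use XPath 2.0, this will work (note the single / after root, so that only the children of root are tested and the nested a is not selected on its own):

/root/*[not(name() = following-sibling::*/name())]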

cordsen answered Nov 15 '22 10:11


XSLT based solution:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="root/*">
        <xsl:variable name="n" select="name()"/>
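        <!-- Copy the element only if no later sibling element has the same name.
             self::* is used because XPath 1.0 does not allow a predicate on "." -->
        <xsl:copy-of
            select="self::*[not(following-sibling::*[name()=$n])]"/>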
    </xsl:template>
</xsl:stylesheet>

Produced output:

<c>Four</c>
<a>Five</a>
<b>
   <a>Six</a>
</b>

Second solution (the select value can be used as a single XPath expression, but the element names must be known in advance):

<xsl:template match="/root">
    <xsl:copy-of select="a[not(./following-sibling::a)]
        | b[not(./following-sibling::b)]
        | c[not(./following-sibling::c)]"/>
</xsl:template>

Grzegorz Szpetkowski answered Nov 15 '22 12:11