
Unique xml nodes based on attribute

Tags: xml, xslt

I want to use XSLT to transform a set of documents into one structure. I have the transformation working correctly to concatenate the documents. However, I don't know whether the documents contain duplicate entries, which I will need to remove.

I need to know how to remove these duplicates (if they exist) by an id attribute. All duplicates will have the same id. I know it will have something to do with keys and generate-id functions.

<root>
    <item id="1001">A</item>
    <item id="1003">C</item>
    <item id="1004">D</item>
    <item id="1002">B</item>
    <item id="1001">A</item>
    <item id="1003">C</item>
    <item id="1004">D</item>
    <item id="1005">E</item>
</root>

I need an XSLT 1.0 transformation for the above, based on the following...

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

Also, would someone be able to explain how it works to me too? Bit of a noob...

Thanks in advance...

designermonkey asked Jun 17 '26 08:06


1 Answer

Solutions to this are commonly presented using generate-id(), but personally I prefer a slight variation that doesn't use it:-

<xsl:key name="items" match="item" use="@id" />

<xsl:template match="root">
    <root>
        <xsl:copy-of select="item[count(key('items',@id)[1]|.)=1]" />
    </root>
</xsl:template>
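
For anyone who wants to run this as-is, here is a minimal sketch of a complete stylesheet that wraps the fragment above in the xsl:stylesheet header from the question. The xsl:output line is an addition for readable output and is not part of the original answer.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Added for readability only; not required for the de-duplication. -->
    <xsl:output method="xml" indent="yes" />

    <!-- Index every item element by the value of its id attribute. -->
    <xsl:key name="items" match="item" use="@id" />

    <xsl:template match="root">
        <root>
            <!-- Copy an item only if it is the first item (in document order)
                 that the key returns for its own id. -->
            <xsl:copy-of select="item[count(key('items',@id)[1]|.)=1]" />
        </root>
    </xsl:template>

</xsl:stylesheet>
```

Applied to the sample document in the question, this should keep the items in the order 1001 (A), 1003 (C), 1004 (D), 1002 (B), 1005 (E) and drop the second occurrences of 1001, 1003 and 1004. The exact indentation and whether an XML declaration is emitted depend on the processor.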

First you create a key which holds all the item elements, using the id attribute as the lookup key. xsl:key lets the processor build an index which can be used to look up items efficiently.

The technique relies on the fact that when you create a node-set using the | operator you get a unique set of nodes. In other words, if the same node is found on both sides of the | operator, it appears in the resulting set only once.

The expression:-

 key('items',@id)

will return the set of item nodes that have the same id as the current node. So:-

 key('items',@id)[1]

will return only one of the nodes that have that specific id (the first in document order), and it is repeatable (that is, using this expression repeatedly always returns the same node).

Hence the expression:-

 count(key('items',@id)[1]|.)=1

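can only be true for one item node with a specific id value: the node that key('items',@id)[1] itself returns. For that node the union contains a single node; for any later duplicate it contains two nodes, so the count is 2.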
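
To see the count for each item, a small diagnostic stylesheet can help. This is only a sketch for exploring the expression, not part of the solution; the text output format is made up for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text" />

    <!-- Same key as in the answer. -->
    <xsl:key name="items" match="item" use="@id" />

    <xsl:template match="root">
        <xsl:for-each select="item">
            <!-- Print the size of the union of "first item with this id" and
                 "this item": 1 when they are the same node, 2 otherwise. -->
            <xsl:value-of select="concat('position ', position(), ', id ', @id,
                                         ': count = ', count(key('items',@id)[1]|.), '&#10;')" />
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>
```

Run against the sample document, the items at positions 1 to 4 and position 8 should report a count of 1, while positions 5, 6 and 7 (the repeated ids 1001, 1003 and 1004) should report a count of 2, which is why the predicate filters them out.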

The copy-of therefore makes a deep copy of exactly one item node for each distinct id.
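
For comparison, here is a sketch of the more commonly seen generate-id() form mentioned at the start of this answer. It uses the same key and compares node identity through generate-id() instead of counting a union; on the sample input it should select the same items.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:key name="items" match="item" use="@id" />

    <xsl:template match="root">
        <root>
            <!-- generate-id() returns the same string for the same node, so the
                 test is true only when the current item is the first item
                 the key returns for its id. -->
            <xsl:copy-of select="item[generate-id() = generate-id(key('items',@id)[1])]" />
        </root>
    </xsl:template>

</xsl:stylesheet>
```

Both forms ask the same question, "is this node the first one in its key group?", so choosing between them is largely a matter of taste.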

AnthonyWJones answered Jun 20 '26 13:06


