Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Removing Duplicate Elements from my XML file by using an XSLT

Tags:

xml

xslt

Here is an example where I'd like to remove a duplicate entry if the ID is the same. I'm pulling hits from system 'A' and system 'B'. I want system 'A' to have precedence (i.e., if the ID is a duplicate, remove the element from system 'B'). Here's my example:

I am getting this result:

<HitList>
   <Hit System="A" ID="1"/>
   <Hit System="A" ID="2"/>
   <Hit System="A" ID="2"/>
   <Hit System="B" ID="1"/>
   <Hit System="B" ID="2"/>
   <Hit System="B" ID="3"/>
   <Hit System="B" ID="4"/>
</HitList>

I want this result (with the duplicates removed):

<HitList>
   <Hit System="A" ID="1"/>
   <Hit System="A" ID="2"/>
   <Hit System="B" ID="3"/>
   <Hit System="B" ID="4"/>
</HitList>

Current Code:

        <xsl:template match="/RetrievePersonSearchDataRequest">
                    <HitList>
                                <xsl:if test="string(RetrievePersonSearchDataRequest/SystemA/NamecheckResponse/@Status) = string(Succeeded)">
                                            <xsl:for-each select="SystemA/NamecheckResponse/BATCH/ITEMLIST/ITEM/VISQST/NCHITLIST/NCHIT">
                                                        <Hit>
                                                                    <xsl:attribute name="System"><xsl:text>A</xsl:text></xsl:attribute>
                                                                    <xsl:attribute name="PersonID"><xsl:value-of select="number(
                                                        REFUSAL/@UID)"/></xsl:attribute>
                                                        </Hit>
                                            </xsl:for-each>
                                </xsl:if>
                                <xsl:if test="string(RetrievePersonSearchDataRequest/SystemB/NamecheckResponse/@Status) = string(Succeeded)">
                                            <xsl:for-each select="SystemB/NamecheckResponse/PersonIDSearchResponse/personID">
                                                        <Hit>
                                                                    <xsl:attribute name="System"><xsl:text>B</xsl:text></xsl:attribute>
                                                                    <xsl:attribute name="PersonID"><xsl:value-of select="number(.)"/></xsl:attribute>
                                                        </Hit>
                                            </xsl:for-each>
                                </xsl:if>
                    </HitList>
        </xsl:template>

like image 450
jmac Avatar asked Jan 14 '23 01:01

jmac


2 Answers

Here is an efficient XSLT 1.0 solution using keys:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kHitById" match="Hit" use="@ID"/>
 <xsl:key name="kHitAById" match="Hit[@System = 'A']" use="@ID"/>

 <xsl:template match=
  "Hit[generate-id() = generate-id(key('kHitById',@ID)[1])]">

  <xsl:copy-of select=
  "key('kHitAById', @ID)[1]|current()[not(key('kHitAById', @ID))]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the following XML document (intentionally adapted from the provided one, to make it more interesting by placing some Bs before the corresponding As):

<HitList>
   <Hit System="B" ID="1"/>
   <Hit System="A" ID="1"/>
   <Hit System="B" ID="2"/>
   <Hit System="A" ID="2"/>
   <Hit System="A" ID="2"/>
   <Hit System="B" ID="3"/>
   <Hit System="B" ID="4"/>
</HitList>

the wanted, correct result is produced:

<Hit System="A" ID="1"/>
<Hit System="A" ID="2"/>
<Hit System="B" ID="3"/>
<Hit System="B" ID="4"/>
like image 96
Dimitre Novatchev Avatar answered Jan 17 '23 06:01

Dimitre Novatchev


This can be done with a single override of the identity template...

XML Input

<HitList>
    <Hit System="A" ID="1"/>
    <Hit System="A" ID="2"/>
    <Hit System="A" ID="2"/>
    <Hit System="B" ID="1"/>
    <Hit System="B" ID="2"/>
    <Hit System="B" ID="3"/>
    <Hit System="B" ID="4"/>
</HitList>

XSLT 1.0

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="Hit[(@System='B' and @ID=../Hit[@System='A']/@ID) or 
        @ID = preceding-sibling::Hit[@System='A']/@ID]"/>

</xsl:stylesheet>

Output

<HitList>
   <Hit System="A" ID="1"/>
   <Hit System="A" ID="2"/>
   <Hit System="B" ID="3"/>
   <Hit System="B" ID="4"/>
</HitList>
like image 45
Daniel Haley Avatar answered Jan 17 '23 08:01

Daniel Haley