How to output duplicate elements using XSLT?

Tags: xml, xslt

I have XML that looks something like this:

<Root>
  <Fields>
    <Field name="abc" displayName="aaa" />
    <Field name="pqr" displayName="ppp" />
    <Field name="abc" displayName="aaa" />
    <Field name="xyz" displayName="zzz" />
  </Fields>
</Root>

I want the output to contain only those Field elements whose name/displayName combination occurs more than once, if there are any:

<Root>
  <Fields>
    <Field name="abc" displayName="aaa" />
    <Field name="abc" displayName="aaa" />
  </Fields>
</Root>

How can I do this using XSLT?

Unmesh Kondolikar asked May 09 '11 12:05


2 Answers

I. XSLT 1.0 solution:

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kFieldByName" match="Field"
  use="concat(@name, '+', @displayName)"/>

 <xsl:template match=
  "Field[generate-id()
        =
         generate-id(key('kFieldByName',
                     concat(@name, '+', @displayName)
                     )[2])
        ]
  ">
     <xsl:copy-of select=
     "key('kFieldByName',concat(@name, '+', @displayName))"/>
 </xsl:template>
</xsl:stylesheet>

when applied to the provided XML document:

<Root>
    <Fields>
        <Field name="abc" displayName="aaa" />
        <Field name="pqr" displayName="ppp" />
        <Field name="abc" displayName="aaa" />
        <Field name="xyz" displayName="zzz" />
    </Fields>
</Root>

produces the wanted Field elements (without the enclosing Root and Fields elements):

<Field name="abc" displayName="aaa"/>
<Field name="abc" displayName="aaa"/>

Explanation:

  1. Muenchian grouping using a composite key (built from the name and displayName attributes).

  2. The only template in the code matches any Field element that is the second in its corresponding group. Inside the body of the template, the whole group is output. A group with a single member has no second element, so nothing is output for it.

  3. Muenchian grouping is the efficient way to do grouping in XSLT 1.0. The key lets the processor look up a group directly instead of rescanning the document for every Field.

  4. See also my answer to this question.
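
If the Root and Fields wrappers from the question's expected output are also needed, one possible variation is sketched below. It is not part of the answer above: it assumes an identity rule is acceptable and reuses the same kFieldByName key, but instead of copying each group it suppresses every Field whose group has no second member. The surviving Field elements stay in document order.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Same composite key as in the answer -->
 <xsl:key name="kFieldByName" match="Field"
  use="concat(@name, '+', @displayName)"/>

 <!-- Identity rule (an assumption of this sketch):
      copies Root, Fields and everything else unchanged -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Empty template: drops any Field whose key group
      has no second member, i.e. any non-duplicate -->
 <xsl:template match=
  "Field[not(key('kFieldByName',
                 concat(@name, '+', @displayName))[2])]"/>
</xsl:stylesheet>
```

Applied to the sample document, this should leave Root and Fields in place with only the two abc/aaa Field elements inside.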

II. XSLT 2.0 solution:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
     <xsl:for-each-group select="/*/*/Field"
          group-by="concat(@name, '+', @displayName)">
       <xsl:sequence select="current-group()[current-group()[2]]"/>
   </xsl:for-each-group>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document (shown above), the same result is produced:

<Field name="abc" displayName="aaa"/>
<Field name="abc" displayName="aaa"/>

Explanation:

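  1. Use of <xsl:for-each-group> with the same composite grouping key.

  2. Use of the current-group() function. The predicate [current-group()[2]] is true only when the group has a second member, so a single-member group contributes nothing to the output.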
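
The same test can be spelled out more explicitly. The following is a sketch rather than the answer's own code: it assumes the output should carry the literal Root and Fields wrappers from the question, and it replaces the predicate with a count() check on the current group.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
    <!-- Literal wrappers: an assumption taken from the question's expected output -->
    <Root>
      <Fields>
        <xsl:for-each-group select="/*/*/Field"
             group-by="concat(@name, '+', @displayName)">
          <!-- Emit the group only when it has more than one member -->
          <xsl:if test="count(current-group()) gt 1">
            <xsl:sequence select="current-group()"/>
          </xsl:if>
        </xsl:for-each-group>
      </Fields>
    </Root>
  </xsl:template>
</xsl:stylesheet>
```

Unlike the suppress-in-place approach, grouping emits the members of each duplicated group together, so with several duplicated combinations the output order follows the groups rather than strict document order.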

Dimitre Novatchev answered Nov 15 '22 08:11


To find duplicates, visit each Field element and, for each one, select the set of Field elements in the whole document that have the same name and displayName attribute values. If that set contains more than one element, copy the current Field to the output.

Here is an example of a template that achieves this:

<xsl:template match="Field">
    <xsl:variable name="fieldName" select="@name" />
    <xsl:variable name="fieldDisplayName" select="@displayName" />
    <xsl:if test="count(//Field[@name=$fieldName and @displayName=$fieldDisplayName]) > 1">
        <xsl:copy-of select="."/>
    </xsl:if>
</xsl:template>

Executing this template (wrapped in a stylesheet that also copies the Root and Fields elements) on your sample data gives the following output:

<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Fields>
    <Field name="abc" displayName="aaa" />
    <Field name="abc" displayName="aaa" />
  </Fields>
</Root>
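
The answer does not show the surrounding stylesheet. A minimal sketch of one that fits is given below; the identity rule, the output settings and the strip-space declaration are assumptions, and only the Field template comes from the answer.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity rule (assumed): copies Root and Fields unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Template from the answer: it has a higher default priority than
       the identity rule, so it handles every Field element -->
  <xsl:template match="Field">
    <xsl:variable name="fieldName" select="@name" />
    <xsl:variable name="fieldDisplayName" select="@displayName" />
    <!-- Copy the Field only if its name/displayName pair occurs more than once -->
    <xsl:if test="count(//Field[@name=$fieldName and @displayName=$fieldDisplayName]) > 1">
      <xsl:copy-of select="."/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Because the count() expression rescans every Field in the document for each Field visited, this form is easy to read but may become slow on large inputs, where a key-based lookup is likely the better fit.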
Jeff Yates answered Nov 15 '22 08:11