Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Need XSLT transform to remove duplicate elements - sorted by an attribute

I have a terrible piece of XML that I need to process through BizTalk, and I have managed to normalise it into this example below. I am no XSLT ninja, but between the web and the VS2010 debugger, I can find my way around XSL.

I now need a clever bit of XSLT to "weed out" the duplicate elements and only keep the latest ones, as decided by the date in the ValidFromDate attribute.

The ValidFromDate attribute is of the XSD:Date type.

<SomeData>
  <A ValidFromDate="2011-12-01">A_1</A>
  <A ValidFromDate="2012-01-19">A_2</A>
  <B CalidFromDate="2011-12-03">B_1</B>
  <B ValidFromDate="2012-01-17">B_2</B>
  <B ValidFromDate="2012-01-19">B_3</B>
  <C ValidFromDate="2012-01-20">C_1</C>
  <C ValidFromDate="2011-01-20">C_2</C>
</SomeData>

After a transformation I'd like to only keep these lines:

<SomeData>
  <A ValidFromDate="2012-01-19">A_2</A>
  <B ValidFromDate="2012-01-19">B_3</B>
  <C ValidFromDate="2012-01-20">C_1</C>
</SomeData>

Any clues as to how I put that XSL together? I've emptied the internet trying to look for a solution, and I have tried a lot of clever XSL sorting scripts, but none I felt took me in the right direction.

like image 912
LarsWA Avatar asked Jan 26 '12 18:01

LarsWA


People also ask

What is the purpose of terminate attribute in xsl message?

The terminate attribute gives you the choice to either quit or continue the processing when an error occurs.

What is Number () in XSLT?

Specifies the format pattern. Here are some of the characters used in the formatting pattern: 0 (Digit)


1 Answers

The optimal solution for this problem with Xslt 1.0 would be to use Muenchian grouping. (Given that the elements are already sorted by the ValidFromDate attribute) the following stylesheet should do the trick:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:key name="element-key" match="/SomeData/*" use="name()" />

  <xsl:template match="/SomeData">
    <xsl:copy>
      <xsl:for-each select="*[generate-id() = generate-id(key('element-key', name()))]">
        <xsl:copy-of select="(. | following-sibling::*[name() = name(current())])[last()]" />
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

Here is the result I got when running it against your sample Xml:

<?xml version="1.0" encoding="utf-8"?>
<SomeData>
  <A ValidFromDate="2012-01-19">A_2</A>
  <B ValidFromDate="2012-01-19">B_3</B>
  <C ValidFromDate="2011-01-20">C_2</C>
</SomeData>
like image 62
Pawel Avatar answered Sep 29 '22 01:09

Pawel