
Sorting by ID and then by timestamp within the same node

I have a very specific sorting problem in XSLT 1.0 (and only 1.0, since I'm using the .NET XSLT processor).

Here is my XML:

<Root>
....
<PatientsPN>
        <Patient>
            <ID>1</ID>
            <TimeStamp>20111208165819</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Fanny</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>4</ID>
            <TimeStamp>20111208165910</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Fanny4</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>4</ID>
            <TimeStamp>20111208165902</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>FannyMOI</PrenomPatient>
            <Sexe>M</Sexe>
        </Patient>
        <Patient>
            <ID>2</ID>
            <TimeStamp>20111208170000</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>FannyMOI</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>2</ID>
            <TimeStamp>20111208165819</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Fanny</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>2</ID>
            <TimeStamp>20111208170050</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Cmoi2</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>3</ID>
            <TimeStamp>20111208165829</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Jesuis3</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
    </PatientsPN>
...
</Root>

I would like to sort the Patient records in PatientsPN by ID and keep, for each ID, only the record with the highest TimeStamp. My expected output:

<Root>
<PatientsPN>
 <Patient>
            <ID>1</ID>
            <TimeStamp>20111208165819</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Fanny</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
<Patient>
            <ID>2</ID>
            <TimeStamp>20111208170050</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Cmoi2</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
<Patient>
            <ID>3</ID>
            <TimeStamp>20111208165829</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Jesuis3</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
<Patient>
            <ID>4</ID>
            <TimeStamp>20111208165910</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Fanny4</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
</PatientsPN>
</Root>
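
To pin down the rule behind this expected output, here is a minimal sketch in Python (not XSLT, and not part of the stylesheet). It assumes every Patient has a numeric ID and TimeStamp; the sample is trimmed to those two fields, and the helper name `latest_per_id` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample input: only ID and TimeStamp are kept.
SAMPLE = """
<PatientsPN>
  <Patient><ID>1</ID><TimeStamp>20111208165819</TimeStamp></Patient>
  <Patient><ID>4</ID><TimeStamp>20111208165910</TimeStamp></Patient>
  <Patient><ID>4</ID><TimeStamp>20111208165902</TimeStamp></Patient>
  <Patient><ID>2</ID><TimeStamp>20111208170000</TimeStamp></Patient>
  <Patient><ID>2</ID><TimeStamp>20111208165819</TimeStamp></Patient>
  <Patient><ID>2</ID><TimeStamp>20111208170050</TimeStamp></Patient>
  <Patient><ID>3</ID><TimeStamp>20111208165829</TimeStamp></Patient>
</PatientsPN>
"""


def latest_per_id(parent):
    """Return (ID, TimeStamp) pairs: one per ID, highest TimeStamp, sorted by ID."""
    best = {}
    for patient in parent.findall("Patient"):
        pid = int(patient.findtext("ID"))
        ts = int(patient.findtext("TimeStamp"))
        # Remember only the largest timestamp seen so far for this ID.
        if pid not in best or ts > best[pid]:
            best[pid] = ts
    return sorted(best.items())


result = latest_per_id(ET.fromstring(SAMPLE))
for pid, ts in result:
    print(pid, ts)
```

Any stylesheet that solves the problem should select the same four (ID, TimeStamp) pairs for this input.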

First, I tried sorting the list by ID and then, for each node, using XPath to extract the highest timestamp, but that didn't work: the other nodes kept being repeated.

I also tried Muenchian grouping, but I couldn't make it work properly with something more generic.

My XSLT is:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:param name="mark">PN</xsl:param>
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

    <xsl:template match="/">
        <Root>
            <xsl:apply-templates/>
        </Root>
    </xsl:template>

    <xsl:template match="/Root/*">
        <xsl:for-each select=".">
            <xsl:choose>
                <xsl:when test="substring(name(), (string-length(name()) - string-length($mark)) + 1) = $mark">
                    <!-- Search for an ID tag -->
                    <xsl:copy>
                        <xsl:if test="node()/ID">
<xsl:for-each select="node()">
                                <xsl:sort select="ID" order="ascending" />
<!-- So far everything I've done here failed -->
<xsl:for-each select=".[ID = '1']">
                                <xsl:copy>
                                  <xsl:copy-of select="node()[not(number(TimeStamp) &lt; (preceding-sibling::node()/TimeStamp | following-sibling::node()/TimeStamp))]"/>
                                  </xsl:copy>
                                </xsl:for-each>
<!-- This is just an example, I don't want to have ID = 1 and ID = 2 -->
</xsl:for-each>
                        </xsl:if>

                        <xsl:if test="not(node()/ID)">
                            <xsl:copy-of select="node()[not(number(TimeStamp) &lt; (preceding-sibling::node()/TimeStamp | following-sibling::node()/TimeStamp))]"/>
                        </xsl:if>
                    </xsl:copy>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:copy-of select="."/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

I hope I made myself clear. Thanks in advance for any help!

EDIT:

I'm really sorry, folks; I should have mentioned that I want this to be as generic as possible. My example uses PatientsPN, but what I'm really trying to do is match every parent node whose name ends with PN (hence the XSLT 1.0 imitation of ends-with in my stylesheet).

You are really amazing anyway; I couldn't have expected more. Thanks!

SOLUTION: After adapting the solution given by Dimitre, I came up with this XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:key name="kPatById"
             match="*['PN' = substring(name(), string-length(name()) -1)]/*"
             use="concat(generate-id(..), '|', ID)"/>

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="*['PN' = substring(name(), string-length(name()) -1)]">
        <xsl:copy>
            <xsl:apply-templates select="node()">
                <xsl:sort select="ID" data-type="number"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="*['PN' = substring(name(), string-length(name()) -1)]/node()[TimeStamp &lt; key('kPatById', concat(generate-id(..), '|', ID))/TimeStamp]"/>
</xsl:stylesheet>

It does the job wonderfully, and it handles multiple parent nodes, each of which is filtered and sorted independently.

bosam asked Dec 13 '11 11:12


2 Answers

It can be as simple as this:

I. XSLT 1.0 solution:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kPatById" match=
 "*['PN' = substring(name(), string-length(name()) -1)]/Patient"
  use="concat(generate-id(..), '|', ID)"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match=
 "*['PN' = substring(name(), string-length(name()) -1)]">
  <xsl:apply-templates select="Patient">
    <xsl:sort select="ID" data-type="number"/>
  </xsl:apply-templates>
 </xsl:template>

 <xsl:template match=
 "*['PN' = substring(name(), string-length(name()) -1)]
     /Patient
       [TimeStamp &lt; key('kPatById', concat(generate-id(..), '|', ID))/TimeStamp]
  "/>
</xsl:stylesheet>
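
The empty template at the end relies on how XPath 1.0 compares a value with a node-set: `TimeStamp < key(...)/TimeStamp` is true as soon as any node returned by the key has a larger value. Here is a rough Python model of that rule, using plain tuples instead of XML nodes. The parent labels `p1` and `p2` stand in for `generate-id(..)`, and the `p2` record is invented purely for illustration.

```python
from collections import defaultdict

# (parent, ID, TimeStamp) triples; "p1"/"p2" play the role of generate-id(..).
records = [
    ("p1", 4, 20111208165910),
    ("p1", 4, 20111208165902),
    ("p1", 2, 20111208170000),
    ("p1", 2, 20111208170050),
    ("p2", 4, 20111208160000),  # hypothetical: same ID under a different parent
]

# Rough equivalent of xsl:key with use="concat(generate-id(..), '|', ID)".
groups = defaultdict(list)
for parent, pid, ts in records:
    groups[(parent, pid)].append(ts)


def is_suppressed(parent, pid, ts):
    """True when some record in the same (parent, ID) group has a larger timestamp."""
    return any(ts < other for other in groups[(parent, pid)])


kept = [r for r in records if not is_suppressed(*r)]
for r in kept:
    print(r)
```

A record holding its group's maximum is never less than any member, so it survives. If two records tie for the maximum, both survive. Because the parent is part of the key, the same ID under another xxxPN element is judged separately.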

When this transformation is applied to the provided XML document:

<Root>
....
    <PatientsPN>
        <Patient>
            <ID>1</ID>
            <TimeStamp>20111208165819</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Fanny</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>4</ID>
            <TimeStamp>20111208165910</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Fanny4</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>4</ID>
            <TimeStamp>20111208165902</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>FannyMOI</PrenomPatient>
            <Sexe>M</Sexe>
        </Patient>
        <Patient>
            <ID>2</ID>
            <TimeStamp>20111208170000</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>FannyMOI</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>2</ID>
            <TimeStamp>20111208165819</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Fanny</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>2</ID>
            <TimeStamp>20111208170050</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Cmoi2</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
        <Patient>
            <ID>3</ID>
            <TimeStamp>20111208165829</TimeStamp>
            <NomPatient>Dudule</NomPatient>
            <PrenomPatient>Jesuis3</PrenomPatient>
            <Sexe>F</Sexe>
        </Patient>
    </PatientsPN>
...
</Root>

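the following result is produced (the PatientsPN wrapper itself is not copied, because the template matching the xxxPN element only applies templates to its Patient children; wrap that xsl:apply-templates in xsl:copy to keep the wrapper):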

<Root>
....
    <Patient>
      <ID>1</ID>
      <TimeStamp>20111208165819</TimeStamp>
      <NomPatient>Dudule</NomPatient>
      <PrenomPatient>Fanny</PrenomPatient>
      <Sexe>F</Sexe>
   </Patient>
   <Patient>
      <ID>2</ID>
      <TimeStamp>20111208170050</TimeStamp>
      <NomPatient>Dudule</NomPatient>
      <PrenomPatient>Cmoi2</PrenomPatient>
      <Sexe>F</Sexe>
   </Patient>
   <Patient>
      <ID>3</ID>
      <TimeStamp>20111208165829</TimeStamp>
      <NomPatient>Dudule</NomPatient>
      <PrenomPatient>Jesuis3</PrenomPatient>
      <Sexe>F</Sexe>
   </Patient>
   <Patient>
      <ID>4</ID>
      <TimeStamp>20111208165910</TimeStamp>
      <NomPatient>Dudule</NomPatient>
      <PrenomPatient>Fanny4</PrenomPatient>
      <Sexe>F</Sexe>
   </Patient>
...
</Root>

Explanation:

  1. Matching any element whose name ends with "PN", using a combination of substring() and string-length().

  2. Overriding the identity rule.

  3. Sorting, but not using any Muenchian grouping.

  4. Using a key to get all records of the same patient under the same xxxPN parent.

  5. "Simple" maximum (without sorting).

  6. Proper pattern matching to exclude any unwanted record.
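
As a side note on point 1, the expression `substring(name(), string-length(name()) - 1)` returns the last two characters of the name, so it only works as an "ends with" test for a two-character suffix such as "PN". The sketch below models XPath 1.0 `substring()` in Python (integer start positions only, an assumption made to keep it short) and compares the hard-coded form with the general form used in the question's stylesheet. The element name `DoctorsXYZ` is hypothetical.

```python
def xpath_substring(s, start):
    """Model of XPath 1.0 substring(s, start) for integer start (1-based positions)."""
    # Positions below 1 simply select from the beginning of the string.
    return s[max(start, 1) - 1:]


def ends_with_pn(name):
    # Mirrors: 'PN' = substring(name(), string-length(name()) - 1)
    return xpath_substring(name, len(name) - 1) == "PN"


def ends_with(name, mark):
    # Mirrors: substring(name(), string-length(name()) - string-length($mark) + 1) = $mark
    return xpath_substring(name, len(name) - len(mark) + 1) == mark


for candidate in ["PatientsPN", "Root", "PN", "N"]:
    print(candidate, ends_with_pn(candidate), ends_with(candidate, "PN"))
```

For a suffix of any other length, the general form with `string-length($mark)` is the one to use.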


II. XSLT 2.0 solution:

The XSLT 2.0 solution I find best is almost the same as the XSLT 1.0 solution above, but may be more efficient:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kPatById" match="*[ends-with(name(),'PN')]/Patient"
          use="concat(generate-id(..), '|', ID)"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="*[ends-with(name(),'PN')]">
  <xsl:apply-templates select="Patient">
    <xsl:sort select="ID" data-type="number"/>
  </xsl:apply-templates>
 </xsl:template>

 <xsl:template match=
 "*[ends-with(name(),'PN')]
     /Patient
        [number(TimeStamp)
        lt
          max((key('kPatById', concat(generate-id(..), '|', ID))
                                             /TimeStamp/xs:double(.)))
        ]"/>
</xsl:stylesheet>

Dimitre Novatchev answered Oct 12 '22 21:10


The first answer (I didn't check it) seems to be correct, but I couldn't resist posting an XSLT 1.0 version that does not use the 'evil' for-each instruction.

The sorting is done by concatenating the ID and the timestamp into a single sort key.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
    <xsl:output method="xml" indent="yes"/>

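  <xsl:template match="PatientsPN">
    <xsl:copy>
      <!-- Select only this element's own Patient children, not every Patient in the document -->
      <xsl:apply-templates select="Patient">
        <xsl:sort select="concat(ID,TimeStamp)"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Patient">
    <!-- Axes such as following-sibling use document order, not the sorted order,
         so compare timestamps explicitly: keep this Patient only if no sibling
         with the same ID has a later TimeStamp -->
    <xsl:if test="not(../Patient[ID = current()/ID][TimeStamp &gt; current()/TimeStamp])">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>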

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
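
One caveat about the concatenated sort key: `xsl:sort` defaults to text comparison, so `concat(ID,TimeStamp)` orders correctly only while all IDs have the same number of digits. The Python sketch below imitates that with plain string sorting (an approximation, since the exact text collation is processor-dependent). The row with ID 10 is hypothetical and does not appear in the sample data.

```python
# (ID, TimeStamp) pairs as strings, the way they appear in the XML.
rows = [
    ("2", "20111208170050"),
    ("10", "20111208165819"),  # hypothetical two-digit ID
    ("4", "20111208165910"),
]

# Text sort on the concatenated key, as with <xsl:sort select="concat(ID,TimeStamp)"/>.
text_order = sorted(rows, key=lambda r: r[0] + r[1])

# Numeric sort on ID first, then on TimeStamp.
numeric_order = sorted(rows, key=lambda r: (int(r[0]), int(r[1])))

print([r[0] for r in text_order])
print([r[0] for r in numeric_order])
```

In XSLT 1.0 the numeric ordering can be obtained with two `xsl:sort` elements, one on ID and one on TimeStamp, each with `data-type="number"`.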

Hope this helps.

Marvin Smit answered Oct 12 '22 21:10