xsl:sort: sorting by numeric value

I need to sort codes in numerical order. The codes have four characters and four numerals.

For example:

COMP2100
COMP2400
COMP3410
LAWS2202
LAWS2250

When I just use <xsl:sort select="code" order="ascending" />, I get the result above.

However, I want the codes in 'numerical order', that is:

COMP2100
LAWS2202
LAWS2250
COMP2400
COMP3410

How do I do this?

Jane Doe asked Dec 16 '22 19:12


2 Answers

Note: the OP has now provided sample XML. The approaches below can be trivially adapted to that XML.

I. XSLT 1.0 (part 1)

Here is a simple solution that assumes your assertion ("the codes have four characters and four numerals") will always be the case. When this XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
  <xsl:output omit-xml-declaration="no" indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:variable name="vNums" select="'1234567890'" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/*">
    <t>
      <xsl:apply-templates>
        <xsl:sort select="substring(., 5)"
          data-type="number" />
      </xsl:apply-templates>
    </t>
  </xsl:template>
</xsl:stylesheet> 

...is applied to a hypothetical XML document, with the codes shuffled into random order:

<?xml version="1.0" encoding="utf-8"?>
<t>
  <i>COMP3410</i>
  <i>LAWS2202</i>
  <i>COMP2400</i>
  <i>COMP2100</i>
  <i>LAWS2250</i>
</t>

...the correct result is produced:

<?xml version="1.0" encoding="utf-8"?>
<t>
  <i>COMP2100</i>
  <i>LAWS2202</i>
  <i>LAWS2250</i>
  <i>COMP2400</i>
  <i>COMP3410</i>
</t>

Explanation:

  • The Identity Transform -- one of the (if not the) most fundamental design patterns in XSLT -- copies all nodes from the source XML document to the result XML document as-is.
  • One template overrides the Identity Transform for the root element: it sorts all children of <t> on the characters from position 5 to the end of the string, compared as numbers (data-type="number").

Again, note that this solution assumes your original assertion -- "the codes have four characters and four numerals" -- is (and always will be) true.
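
To see which key each item contributes to the sort, a throwaway stylesheet along the following lines can help. This is only a sketch: it assumes the same <t>/<i> document as above, and the key attribute is a made-up name used purely to make the sort key visible in the output.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:template match="/*">
    <t>
      <xsl:for-each select="i">
        <!-- Same key as in the solution: characters 5..end, compared as numbers -->
        <xsl:sort select="substring(., 5)" data-type="number" />
        <!-- 'key' is a hypothetical attribute, added only to display the key -->
        <i key="{number(substring(., 5))}">
          <xsl:value-of select="." />
        </i>
      </xsl:for-each>
    </t>
  </xsl:template>
</xsl:stylesheet>
```

For <i>COMP2100</i> this should emit <i key="2100">COMP2100</i>. With exactly four digits in every code, a plain text comparison of the substring would give the same order; data-type="number" starts to matter once the numeric parts differ in length, since as text "950" sorts after "2100".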


II. XSLT 1.0 (part 2)

A (potentially) safer solution would be to assume that there might be numerous non-numeric characters in various positions within the <i> nodes. In that case, this XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
  <xsl:output omit-xml-declaration="no" indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:variable name="vNums" select="'1234567890'" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/*">
    <t>
      <xsl:apply-templates>
        <xsl:sort select="translate(., translate(., $vNums, ''), '')"
          data-type="number" />
      </xsl:apply-templates>
    </t>
  </xsl:template>
</xsl:stylesheet>

...provides the same result:

<?xml version="1.0" encoding="utf-8"?>
<t>
  <i>COMP2100</i>
  <i>LAWS2202</i>
  <i>LAWS2250</i>
  <i>COMP2400</i>
  <i>COMP3410</i>
</t>

Explanation:

  • The Identity Transform is once again used.
  • In this case, the additional template uses the so-called Double Translate Method (first proposed by Michael Kay and first shown to me by Dimitre Novatchev) to remove all non-numeric characters from the value of each <i> element before sorting.
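
The two translate() calls can be easier to follow when split into named steps. The sketch below assumes the same <t>/<i> document; the variable names vNonDigits and vDigits and the two output attributes are illustrative only and are not part of the solution above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:variable name="vNums" select="'1234567890'" />

  <xsl:template match="/*">
    <t>
      <xsl:for-each select="i">
        <!-- Step 1 (inner translate): delete every digit, leaving only the unwanted characters -->
        <xsl:variable name="vNonDigits" select="translate(., $vNums, '')" />
        <!-- Step 2 (outer translate): delete those unwanted characters from the original value -->
        <xsl:variable name="vDigits" select="translate(., $vNonDigits, '')" />
        <!-- Hypothetical attributes that expose both intermediate strings -->
        <i non-digits="{$vNonDigits}" digits="{$vDigits}">
          <xsl:value-of select="." />
        </i>
      </xsl:for-each>
    </t>
  </xsl:template>
</xsl:stylesheet>
```

For COMP2100 the inner call yields COMP and the outer call yields 2100. Note that digits from anywhere in the string are concatenated, so a value such as A1B22 would contribute the key 122.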

III. XSLT 2.0 Solution

Here's a possible XSLT 2.0 solution that is very similar to part 2 of the XSLT 1.0 solution; it merely replaces the Double Translate Method with XPath 2.0's ability to handle regular expressions:

<xsl:sort select="replace(., '[^\d]', '')" data-type="number" />

Note that you are by no means required to use regular expressions in XPath 2.0; the Double Translate Method works just as well there as it does in XPath 1.0. The replace() function will, however, most likely be more efficient.
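
For completeness, the one-line sort above could sit in a stylesheet like the following. This is a minimal sketch rather than a tested drop-in: it assumes an XSLT 2.0 processor and the same <t>/<i> document, and it uses number() instead of data-type="number" so that the key is an xs:double and is therefore compared numerically.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:template match="/*">
    <xsl:copy>
      <xsl:for-each select="*">
        <!-- \D matches any non-digit character; stripping them leaves only the digits -->
        <xsl:sort select="number(replace(., '\D', ''))" />
        <xsl:copy-of select="." />
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

An element with no digits at all produces NaN as its key under this approach, so it is worth deciding how such items should be ordered before relying on it.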

ABach answered Jan 26 '23 01:01


There are two obvious errors in the provided XSLT code:

  1. The namespace used to select elements is different from the default namespace of the provided XML document. Just change the xmlns:xsi declaration in the stylesheet so that its URI is exactly the document's default namespace URI: xmlns:xsi="file://Volumes/xxxxxxx/Assignment".

  2. The sort at present is not numeric. Change:

    <xsl:sort select="xsi:code" order="ascending" />

to:

   <xsl:sort select="substring(xsi:code, 5)" data-type="number" />

The complete transformation becomes:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:fn="http://www.w3.org/2005/xpath-functions"
 xmlns:xsi="file://Volumes/u4783938/Assignment">
<xsl:template match="/">
    <html>
    <head>
        <title> Course Catalogue </title>
    </head>
    <body bgcolor="#FF9999">
        <h1> <div style="text-align:center"> Course Catalogue </div> </h1>
        <xsl:for-each select="xsi:catalogue/xsi:course">
        <xsl:sort select="substring(xsi:code, 5)"
         data-type="number" />
        <div style="width:1000px;margin-bottom:4px;color:white;background-color:#F36;text-align:justify;border:outset;margin-left:auto;margin-right:auto;">
            <xsl:apply-templates select="xsi:code" />
            <br />
            <xsl:apply-templates select="xsi:title" />
            <br />
            <xsl:apply-templates select="xsi:year" />
            <br />
            <xsl:apply-templates select="xsi:science" />
            <br />
            <xsl:apply-templates select="xsi:area" />
            <br />
            <xsl:apply-templates select="xsi:subject" />
            <br />
            <xsl:apply-templates select="xsi:updated" />
            <br />
            <xsl:apply-templates select="xsi:unit" />
            <br />
            <xsl:apply-templates select="xsi:description" />
            <br />
            <xsl:apply-templates select="xsi:outcomes" />
            <br />
            <xsl:apply-templates select="xsi:incompatibility" />
        </div>
        </xsl:for-each>
    </body>
    </html>
</xsl:template>
</xsl:stylesheet>

and when applied to this XML document:

<catalogue xmlns="file://Volumes/u4783938/Assignment"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="file://Volumes/u4443554/Assignment/courses.xsd">
    <course>
        <code>ABCD3410</code>
        <title> Information Technology in Electronic Commerce </title>
        <year>later year</year>
        <science>C</science>
        <area> Research School of Computer Science </area>
        <subject> Computer Science </subject>
        <updated>2012-03-13T13:12:00</updated>
        <unit>6</unit>
        <description>Tce </description>
        <outcomes>Up trCommerce. </outcomes>
        <incompatibility>COMP1100</incompatibility>
    </course>
    <course>
        <code>COMP2011</code>
        <title> Course 2011 </title>
        <year>Year 2011</year>
        <science>C++</science>
        <area> Research School of Computer Science </area>
        <subject> Computer Science </subject>
        <updated>2012-03-13T13:12:00</updated>
        <unit>6</unit>
        <description>Tce </description>
        <outcomes>Up trCommerce. </outcomes>
        <incompatibility>COMP1100</incompatibility>
    </course>
</catalogue>

the produced result is now correctly sorted by the numeric part of the course code:

<html xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:xsi="file://Volumes/u4783938/Assignment">
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

      <title> Course Catalogue </title>
   </head>
   <body bgcolor="#FF9999">
      <h1>
         <div style="text-align:center"> Course Catalogue </div>
      </h1>
      <div style="width:1000px;margin-bottom:4px;color:white;background-color:#F36;text-align:justify;border:outset;margin-left:auto;margin-right:auto;">COMP2011<br> Course 2011 <br>Year 2011<br>C++<br> Research School of Computer Science <br> Computer Science <br>2012-03-13T13:12:00<br>6<br>Tce <br>Up trCommerce. <br>COMP1100
      </div>
      <div style="width:1000px;margin-bottom:4px;color:white;background-color:#F36;text-align:justify;border:outset;margin-left:auto;margin-right:auto;">ABCD3410<br> Information Technology in Electronic Commerce <br>later year<br>C<br> Research School of Computer Science <br> Computer Science <br>2012-03-13T13:12:00<br>6<br>Tce <br>Up trCommerce. <br>COMP1100
      </div>
   </body>
</html>
Dimitre Novatchev answered Jan 26 '23 02:01