
Efficient mapping of two large lists

I've been tasked with writing some XSLT 2.0 to translate one XML document into another. I'm relatively new to XSLT, but I have learned a lot in the days I've been doing this. During this time I have had to map simple values, e.g. 002 -> TH. This has been fine for small lists of fewer than 10 values, where I used xsl:choose. However, I now need to map over 300 values from one list to another and vice versa. Each list has a value and a textual description. The values in the two lists do not always map directly, so I may have to compare textual descriptions and fall back to default values where necessary.

I have two solutions to the problem:

  1. Use xsl:choose: I think this could be slow, and possibly hard to update if either list changes.

  2. Keep an XML document that records the relationship between the items of the two lists, and use an XPath expression to retrieve the associated value. This is my preferred solution because I believe it will be more maintainable and easier to update, although I'm not sure it is efficient.

Which solution should I use: one of my suggestions, or is there a better way to map these values?
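
For illustration, option 2 without any further optimization might look like the sketch below. The file name map.xml, the entry/@key layout, and the 'UNKNOWN' fallback are assumptions made for the example, not part of the question. The predicate lookup is what raises the efficiency concern: unless the processor optimizes it, every data element triggers a scan of all entries.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical lookup file: <map><entry key="002">TH</entry>...</map> -->
  <xsl:variable name="map" select="document('map.xml')"/>

  <xsl:template match="data">
    <data>
      <!-- Plain predicate lookup; 'UNKNOWN' is an assumed fallback value -->
      <xsl:value-of select="($map/map/entry[@key = current()], 'UNKNOWN')[1]"/>
    </data>
  </xsl:template>
</xsl:stylesheet>
```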


2 Answers

Here is an XSLT 2.0 solution.

Source XML file:

<input>
  <data>001</data>
  <data>002</data>
  <data>005</data>
</input>

"Mapping" xml file:

<map>
  <default>?-?-?</default>
  <input value="001">RZ</input>
  <input value="002">TH</input>
  <input value="003">SC</input>
</map>

XSLT transformation:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:param name="pmapFile" 
       select="'C:/temp/deleteMap.xml'" />

  <xsl:variable name="vMap" 
       select="document($pmapFile)" />

  <xsl:variable name="vDefault" 
       select="$vMap/*/default/text()" />

  <xsl:key name="kInputByVal" match="input" 
   use="@value" />

  <xsl:template match="/*">
    <output>
      <xsl:apply-templates/>
    </output>
  </xsl:template>

  <xsl:template match="data">
    <data>
      <!-- Take the mapped value if the key lookup finds one; otherwise the default -->
      <xsl:sequence select="(key('kInputByVal', ., $vMap)[1]/text(), $vDefault)[1]"/>
    </data>
  </xsl:template>
</xsl:stylesheet>

Output:

<output>
  <data>RZ</data>
  <data>TH</data>
  <data>?-?-?</data>
</output>

Do note the following:

  1. The document() function is used to load the "mapping" document, which is stored in a separate XML file.

  2. <xsl:key/> and the three-argument form of the key() function, available from XSLT 2.0, are used to look up each corresponding output value. The third argument specifies the XML document that must be accessed and indexed, so the lookup does not depend on the current context node.
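
As a sketch of how the same lookup could be packaged for reuse, the three-argument key() call can be wrapped in an XSLT 2.0 stylesheet function. The function name my:lookup and the namespace urn:example:lookup are hypothetical and not part of the answer above; the key, parameter, and map layout are the ones already shown.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:lookup"
    exclude-result-prefixes="xs my">

  <xsl:param name="pmapFile" select="'C:/temp/deleteMap.xml'"/>
  <xsl:variable name="vMap" select="document($pmapFile)"/>
  <xsl:key name="kInputByVal" match="input" use="@value"/>

  <!-- Hypothetical helper: returns the mapped value, or the map's default -->
  <xsl:function name="my:lookup" as="xs:string">
    <xsl:param name="code" as="xs:string"/>
    <xsl:sequence select="string((key('kInputByVal', $code, $vMap),
                                  $vMap/*/default)[1])"/>
  </xsl:function>

  <xsl:template match="data">
    <data><xsl:value-of select="my:lookup(.)"/></data>
  </xsl:template>
</xsl:stylesheet>
```

Because the third argument tells key() where to search, the function needs no context item, which is why it can be called from any template.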

Answered by Dimitre Novatchev, May 23 '26 13:05


Here is a way to do what you intend, using an <xsl:key> and otherwise following your second approach.

The sample input file (data.xml):

<?xml version="1.0" encoding="utf-8"?>
<input>
  <data>001</data>
  <data>002</data>
  <data>005</data>
</input>

The sample map file (map.xml):

<?xml version="1.0" encoding="utf-8"?>
<map default="??">
  <entry key="001">RZ</entry>
  <entry key="002">TH</entry>
  <entry key="003">SC</entry>
</map>

The sample XSL stylesheet, explanation follows:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <xsl:param name="map-file" select="string('map.xml')" />
  <xsl:variable name="map-doc" select="document($map-file)" />
  <xsl:variable name="default-value" select="$map-doc/map/@default" />
  <xsl:key name="map" match="/map/entry" use="@key" />

  <xsl:template match="/input">
    <output>
      <xsl:apply-templates select="data" />
    </output>
  </xsl:template>

  <xsl:template match="data">
    <xsl:variable name="raw-value" select="." />
    <xsl:variable name="mapped-value">
      <xsl:for-each select="$map-doc">
        <xsl:value-of select="key('map', $raw-value)" />
      </xsl:for-each>
    </xsl:variable>
    <data>
      <xsl:choose>
        <xsl:when test="$mapped-value = ''">
          <xsl:value-of select="$default-value" />
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$mapped-value" />
        </xsl:otherwise>
      </xsl:choose>
    </data>
  </xsl:template>
</xsl:stylesheet>

What this does is:

  • use document() to open map.xml, saving the resulting node-set to a variable
  • save the default value for further reference
  • prepare an <xsl:key> to work against the "map" node set
  • use <xsl:for-each> not as a loop, but as a means to switch the execution context before calling the key() function - otherwise key() would work against the "data" document and return nothing
  • find the corresponding node with the key() function and save its string value in a variable
  • check the variable value on output - if it is empty, use the default value
  • repeat (through <xsl:apply-templates>)

The credit for the neat <xsl:for-each> trick goes to Jeni Tennison, who described the technique on the XSL mailing list. Be sure to read the thread.

Output of running the stylesheet against data.xml:

<?xml version="1.0" encoding="utf-8"?>
<output>
  <data>RZ</data>
  <data>TH</data>
  <data>??</data>
</output>

All of this is XSLT 1.0. I'm convinced that a more elegant version exists that makes use of the advantages XSLT 2.0 offers, but unfortunately I'm not overly familiar with XSLT 2.0. Perhaps someone else can post a better solution.


EDIT

Thanks to Dimitre Novatchev's hint in the comments, I was able to create a considerably shorter and preferable stylesheet:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <xsl:param name="map-file" select="string('map.xml')" />
  <xsl:variable name="map-doc" select="document($map-file)" />
  <xsl:variable name="default" select="$map-doc/map/default[1]" />
  <xsl:key name="map" match="/map/entry" use="@key" />

  <xsl:template match="/input">
    <output>
      <xsl:apply-templates select="data" />
    </output>
  </xsl:template>

  <xsl:template match="data">
    <xsl:variable name="raw-value" select="." />
    <data>
      <xsl:for-each select="$map-doc">
        <xsl:value-of select="(key('map', $raw-value)|$default)[1]" />
      </xsl:for-each>
    </data>
  </xsl:template>
</xsl:stylesheet>

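However, this one requires a slightly different map file to work in XSLT 1.0. A union is evaluated in document order, so [1] selects the default only when no entry matches, provided the default element comes after all entries: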

<?xml version="1.0" encoding="utf-8"?>
<map>
  <entry key="001">RZ</entry>
  <entry key="002">TH</entry>
  <entry key="003">SC</entry>
  <!-- default entry must be last in document -->
  <default>??</default>
</map>
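
The question also asks for the mapping in the opposite direction. A minimal sketch, assuming the same map.xml and the same context-switching technique, is to declare a second key over the same entries, indexed by their text content instead of by @key. The key name map-reverse is hypothetical, and no default is applied here because the map file defines none for this direction.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <xsl:variable name="map-doc" select="document('map.xml')" />
  <!-- Hypothetical second key: index the entries by their text content -->
  <xsl:key name="map-reverse" match="/map/entry" use="." />

  <xsl:template match="/input">
    <output>
      <xsl:apply-templates select="data" />
    </output>
  </xsl:template>

  <xsl:template match="data">
    <xsl:variable name="raw-value" select="." />
    <data>
      <!-- Switch context to the map document before calling key() -->
      <xsl:for-each select="$map-doc">
        <!-- Outputs the @key of the first matching entry; empty if none -->
        <xsl:value-of select="key('map-reverse', $raw-value)/@key" />
      </xsl:for-each>
    </data>
  </xsl:template>
</xsl:stylesheet>
```

With the map file above, an input of <data>TH</data> would be translated back to <data>002</data>. If several entries share the same text, only the first one in document order is used.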

Answered by Tomalak, May 23 '26 12:05