 

XSLT optimisation: access a child multiple times or use a variable?

Tags: xml, xslt

I need some information to optimize my XSLT.

In my template I access the same child multiple times, for example:

<xsl:template match="user">
 <h1><xsl:value-of select="address/country"/></h1>
 <p><xsl:value-of select="address/country"/></p>
 <p><xsl:value-of select="address/country"/></p>
 <!-- ... more and more ... -->
 <p><xsl:value-of select="address/country"/></p>
</xsl:template>

Would it be better to store the content of the child element in a variable and reference the variable directly, to avoid parsing the tree every time?

<xsl:template match="user">
 <xsl:variable name="country" select="address/country"/>
 <h1><xsl:value-of select="$country"/></h1>
 <p><xsl:value-of select="$country"/></p>
 <p><xsl:value-of select="$country"/></p>
 <!-- ... more and more ... -->
 <p><xsl:value-of select="$country"/></p>
</xsl:template>

Or will the use of a variable consume more resources than parsing the tree multiple times?

asked Mar 20 '23 12:03 by ylerjen


2 Answers

Usually, an XML file is parsed once, as a whole, and then held in memory as an XDM (XQuery and XPath Data Model) tree. So, I guess that by

than parsing the tree multiple times

you actually meant accessing the internal representation of the XML input multiple times. In terms of the figure below, we are talking about the source tree:

[Figure showing the source tree, taken from Michael Kay's XSLT 2.0 and XPath 2.0 Programmer's Reference, page 43]

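Likewise, the value of an xsl:variable is held in memory and needs to be accessed, too. Note that a variable with a select attribute, as in your example, only refers to nodes that already exist in the source tree; xsl:variable creates a new node (or, more precisely, a temporary document) only when its value is given as content inside the element.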
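
What a variable actually holds can be checked directly. The following is a minimal sketch, assuming an XSLT 2.0 processor and the user/address/country structure from the question; the variable names by-select and by-content and the output elements are made up for illustration. As far as I understand the specification, the first test should report an element from the source tree and the second a temporary document node.

```xml
<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="user">
      <!-- select attribute: the variable refers to the existing
           country element(s) in the source tree, nothing is copied -->
      <xsl:variable name="by-select" select="address/country"/>

      <!-- content and no "as" attribute: the variable holds a newly
           built temporary document containing a text node -->
      <xsl:variable name="by-content">
         <xsl:value-of select="address/country"/>
      </xsl:variable>

      <result>
         <!-- is the value a sequence of element nodes? -->
         <by-select-is-element>
            <xsl:value-of select="$by-select instance of element()*"/>
         </by-select-is-element>
         <!-- is the value a single document node? -->
         <by-content-is-document>
            <xsl:value-of select="$by-content instance of document-node()"/>
         </by-content-is-document>
      </result>
   </xsl:template>

</xsl:stylesheet>
```

The select form is the one used in the question, so the variable there is probably little more than a named reference into the source tree; how cheaply that reference is evaluated still depends on the processor.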

Now, what exactly do you mean by optimisation? Do you mean the time it takes to perform the transformation or CPU and memory usage (as you mention "resources" in your question)?

Also, performance depends on the implementation of your XSLT processor of course. The only reliable way of finding out is to actually test this.

Write two stylesheets that differ only in this regard, that is, are identical otherwise. Then, let both of them transform the same input XML and measure the time they take.

My guess is that accessing a variable is faster. It is also more convenient to repeat a variable name than to repeat full paths as you write code (such variables are sometimes called "convenience variables").


EDIT: Replaced with something more appropriate, as a response to your comment.

If you actually test this, write two stylesheets:

Stylesheet with variable

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="/root">
      <xsl:copy>
         <xsl:variable name="var" select="node/subnode"/>
         <subnode nr="1">
            <xsl:value-of select="$var"/>
         </subnode>
         <subnode nr="2">
            <xsl:value-of select="$var"/>
         </subnode>
      </xsl:copy>
   </xsl:template>

</xsl:stylesheet>

Stylesheet without variable

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="/root">
      <xsl:copy>
         <subnode nr="1">
            <xsl:value-of select="node/subnode"/>
         </subnode>
         <subnode nr="2">
            <xsl:value-of select="node/subnode"/>
         </subnode>
      </xsl:copy>
   </xsl:template>

</xsl:stylesheet>

Applied to the following input XML:

<root>
   <node>
      <subnode>helloworld</subnode>
   </node>
</root>

EDIT: As suggested by @Michael Kay, I measured the average time over 100 runs (using -t and -repeat:100 on the Saxon command line):

with variable: 9 ms
without variable: 9 ms

This does not imply that the result is the same with your XSLT processor.

answered Mar 23 '23 03:03 by Mathias Müller


For all performance questions, the answer is: it depends.

  • It depends what XSLT processor you are using, and on the optimizations it performs.

  • It's very likely to depend on how many children have to be searched to find the ones you are looking for.

The only way to find out is to measure it, and to measure it very carefully.

Personally, I would use a variable if there is a complex predicate involved, but not if I'm just looking for children by name.
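
To illustrate that rule of thumb, here is a minimal sketch in which only the expensive selection goes into a variable. It assumes an XSLT 2.0 processor; the type and valid-to attributes and the city and name elements are hypothetical and do not appear in the question.

```xml
<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="user">
      <!-- Complex predicate (hypothetical attributes): select once, reuse below -->
      <xsl:variable name="current-home"
                    select="address[@type = 'home'][not(@valid-to)][1]"/>

      <h1><xsl:value-of select="$current-home/country"/></h1>
      <p><xsl:value-of select="$current-home/city"/></p>

      <!-- Plain child step by name (hypothetical element): left inline -->
      <p><xsl:value-of select="name"/></p>
   </xsl:template>

</xsl:stylesheet>
```

Whether the variable saves any time here is still something to measure; the sketch only shows where the line between the two cases might be drawn.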

In nearly all cases, even if it makes a difference, it is very unlikely to make a difference to the bottom line of your business. If you are interested in improving the bottom line of your business, there are probably better ways to employ your intellect.

answered Mar 23 '23 02:03 by Michael Kay