
XSLT 3.0 streaming (Saxon)

I have a big XML file (6 GB) with this kind of tree:

<Report>
   <Document>
      <documentType>E</documentType>
      <person>
         <firstname>John</firstname>
         <lastname>Smith</lastname>
      </person>
   </Document>
   <Document>
      [...]
   </Document>
   <Document>
      [...]
   </Document>
   [...]
</Report>

When I apply an XSLT stylesheet to it, I get this error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

So I wanted to try the new XSLT 3.0 streaming feature with Saxon 9.6 EE. I don't want the streaming constraints to apply once I am inside a Document. I think what I want to do is very close to the "burst mode" described here: http://saxonica.com/documentation/html/sourcedocs/streaming/burst-mode-streaming.html

Here is my Saxon command line:

java -cp saxon9ee.jar net.sf.saxon.Transform -t -s:input.xml -xsl:stylesheet.xsl -o:output/output.html

Here is my XSLT stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
<xsl:mode streamable="yes" />

<xsl:template match="/">
    GLOBAL HEADER
        <xsl:iterate select="copy-of()/Report/Document" >
           DOC HEADER
           documentType: <xsl:value-of select="documentType"/>
           person/firstname: <xsl:value-of select="person/firstname"/>
           DOC FOOTER
           <xsl:next-iteration/>
        </xsl:iterate>
    GLOBAL FOOTER
</xsl:template>

</xsl:stylesheet>

But I still get the same out-of-memory error.

Thank you for your help!

steco asked Oct 06 '14 22:10


1 Answer

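Your copy-of() has no argument, so it copies the context item, which here is the entire document. That builds the whole 6 GB tree in memory before /Report/Document is ever evaluated. You want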

copy-of(/Report/Document)

which copies each Document in turn. I tend to write it as

/Report/Document/copy-of()

because I think it makes it clearer what is going on.

Incidentally, you don't need xsl:iterate here: xsl:for-each will do the job perfectly well, because processing of one Document doesn't depend on the processing of any previous Document.
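Putting both suggestions together, the stylesheet from the question might look roughly like the sketch below. It is untested against the 6 GB input and assumes the same Saxon 9.6 EE setup and command line as in the question; the only changes are the select expression and the switch from xsl:iterate to xsl:for-each (which also makes xsl:next-iteration unnecessary).

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
<xsl:mode streamable="yes" />

<xsl:template match="/">
    GLOBAL HEADER
        <!-- copy-of() is applied to each Document in turn, so only one
             Document subtree is held in memory at a time -->
        <xsl:for-each select="/Report/Document/copy-of()">
           <!-- Inside the loop the context item is an ordinary in-memory copy,
                so several downward selections are allowed -->
           DOC HEADER
           documentType: <xsl:value-of select="documentType"/>
           person/firstname: <xsl:value-of select="person/firstname"/>
           DOC FOOTER
        </xsl:for-each>
    GLOBAL FOOTER
</xsl:template>

</xsl:stylesheet>
```

Because each copied Document is a normal tree, the streaming restrictions apply only to the outer /Report/Document path, not to the expressions inside the loop body.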

Michael Kay answered Oct 19 '22 22:10