
Splitting large XML files into sub-files without memory contention

Tags:

c#

xml

I have an XML file like the following:

<Jobs>
   <job>
   ....
   </job>
   <job>
   ....
   </job>
   ....
</Jobs>

What is the best way to write each job node to a separate file without loading the whole file into memory, using XmlReader and XmlWriter or any other option?

Umamaheswaran asked Aug 04 '12 05:08



2 Answers

  1. Create an XmlReader for the input file.
  2. Position the reader on the first job element.
  3. Create a subtree XmlReader using the ReadSubtree Method.
  4. Create an XmlWriter for the output file.
  5. Copy the contents of the subtree XmlReader into the XmlWriter using the WriteNode Method.
  6. Position the original reader on the next job element and continue as with the first job element.
    Stop when there are no more job elements to read.
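
The steps above could look roughly like this in C#. This is a minimal sketch, not the answerer's own code: the class name JobSplitter, the method Split, and the job1.xml, job2.xml, ... file naming are assumptions made for illustration, and it assumes the job elements carry no namespace prefix.

```csharp
using System;
using System.IO;
using System.Xml;

static class JobSplitter
{
    // Writes each <job> element found in inputPath to its own file in outputDir.
    // Returns the number of files written. (Names and file layout are hypothetical.)
    public static int Split(string inputPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        int count = 0;

        // Step 1: forward-only reader over the input, so the whole file is never loaded.
        using (XmlReader reader = XmlReader.Create(inputPath))
        {
            // Steps 2 and 6: advance node by node until the reader sits on a <job> start tag.
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element || reader.Name != "job")
                {
                    continue;
                }

                count++;
                string outputPath = Path.Combine(outputDir, "job" + count + ".xml");

                // Step 3: a reader restricted to the current <job> element.
                using (XmlReader subtree = reader.ReadSubtree())
                // Step 4: a writer for this job's output file.
                using (XmlWriter writer = XmlWriter.Create(outputPath))
                {
                    // Step 5: copy the whole subtree into the output file.
                    writer.WriteNode(subtree, true);
                }
                // Once the subtree reader is closed, the outer reader is left on this
                // job's end tag (or on the element itself if it is empty), so the next
                // Read() in the loop condition moves on without skipping a sibling.
            }
        }

        return count;
    }
}
```

Calling JobSplitter.Split("jobs.xml", "out") would then produce out/job1.xml, out/job2.xml, and so on. Only one job subtree is being copied at any time, so memory use should depend on the size of a single job rather than on the size of the input file.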
dtb answered Nov 04 '22 04:11



It's early days yet for XSLT 3.0 and streaming, but the following XSLT 3.0 stylesheet should do the job in Saxon-EE 9.4:

<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:mode streamable="yes" on-no-match="shallow-copy"/>
<xsl:template match="job">
  <xsl:result-document href="job{position()}.xml">
    <xsl:next-match/>
  </xsl:result-document>
</xsl:template>
</xsl:stylesheet>
Michael Kay answered Nov 04 '22 05:11
