I have some very large XML files (800 MB to 1.5 GB) and need to apply an XSLT transformation to them. I am able to read them with XmlTextReader, but when I apply the XSLT transformation I get a System.OutOfMemoryException.
My code looks like this:
    static void Main(string[] args)
    {
        XDocument newTree = new XDocument();
        XmlTextReader oReader = new XmlTextReader(@"C:\Projects\myxml.xml");
        using (XmlWriter writer = newTree.CreateWriter())
        {
            XslCompiledTransform oTransform = new XslCompiledTransform();
            oTransform.Load(@"C:\Projects\myXSLT.xsl");
            oTransform.Transform(oReader, writer);
        }
        Console.WriteLine(newTree);
    }
Thanks in advance. This is very urgent. If I can't find a solution, I will have to split the XML into smaller files and transform each one separately.
XSLT uses XPath, and this requires the whole XML document to be held in memory. The problem of insufficient memory is therefore inherent in the approach.
There are simple rules of thumb for approximating how much memory is needed, and one of them says 5 * text-size.
So, for a "typical 1.5 GB XML file" (5 * 1.5 GB = 7.5 GB), 8 GB of RAM may be sufficient.
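The rule of thumb above can be turned into a quick check before attempting a transformation. This is only a rough sketch: the factor of 5 is the approximation quoted above, not a measured value, and the class and method names are hypothetical.

```csharp
using System;
using System.IO;

class MemoryEstimate
{
    // Rule-of-thumb multiplier quoted above (an approximation, not a guarantee).
    const int RuleOfThumbFactor = 5;

    // Returns the estimated in-memory size, in gigabytes, for a file of the given size.
    static double EstimateGigabytes(long fileSizeBytes)
    {
        return RuleOfThumbFactor * fileSizeBytes / (1024.0 * 1024.0 * 1024.0);
    }

    static void Main(string[] args)
    {
        // Falls back to the path from the question when no argument is given.
        string path = args.Length > 0 ? args[0] : @"C:\Projects\myxml.xml";
        long size = new FileInfo(path).Length;
        Console.WriteLine("Estimated memory needed: {0:F1} GB", EstimateGigabytes(size));
    }
}
```

If the estimate is well above the RAM available to the process, loading the whole document for XSLT is unlikely to succeed.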
Either split the document into smaller parts or wait for an implementation of XSLT 2.1 (the draft that was later renamed XSLT 3.0), which defines special streaming instructions. In the meantime, one can use the latest (commercial) version of Saxon, which implements extensions for streaming; successful processing of a 64 GB document has been reported on Twitter.
We are facing a similar problem. The solution we came up with was to not use XSLT for this case, and instead use LINQ to XML transformations while streaming the data. You can leverage the C# yield keyword to iterate through an XML stream and tackle the file piecemeal this way. See "streaming with LINQ to XML".
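A minimal sketch of that streaming approach might look like the following. The element names ("record", "name", "item", "items"), the "id" attribute, and the output path are hypothetical and would have to match the real document; the projection inside the loop merely stands in for whatever the XSLT templates would have done.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

static class StreamingTransform
{
    // Lazily yields each element with the given name, one at a time,
    // so only a single subtree is held in memory at any moment.
    static IEnumerable<XElement> StreamElements(string path, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(path))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom consumes the whole subtree and leaves the reader
                    // on the node after it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    static void Main()
    {
        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create(@"C:\Projects\output.xml", settings))
        {
            writer.WriteStartElement("items");
            foreach (XElement record in StreamElements(@"C:\Projects\myxml.xml", "record"))
            {
                // Hypothetical projection replacing the XSLT template.
                var item = new XElement("item",
                    new XAttribute("id", (string)record.Attribute("id") ?? ""),
                    (string)record.Element("name"));
                item.WriteTo(writer);
            }
            writer.WriteEndElement();
        }
    }
}
```

Because both reading and writing are forward-only, memory use depends on the size of the largest single record rather than on the size of the file.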
The nature of XSLT requires the XML to be loaded into memory. What you need to do is break the large file down into more manageable pieces. If you use the XML streaming technique, you can break the document up into sub-elements and apply the XSLT to each of them individually. You may have to rewrite the XSLT to accommodate this behavior.
Aside from this, the only other option is to throw more hardware at it, but this might even require an operating system upgrade depending on RAM limitations...
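The fragment-by-fragment approach described above could be sketched roughly as follows. Several things here are assumptions: the repeating element is taken to be called "record", the stylesheet myFragmentXSLT.xsl is a hypothetical rewrite of the original that treats a single record as its input root, and it is assumed to set omit-xml-declaration="yes" so that the concatenated results are not interrupted by repeated XML declarations.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class FragmentTransform
{
    static void Main()
    {
        var transform = new XslCompiledTransform();
        // Hypothetical stylesheet, rewritten to work on one record at a time.
        transform.Load(@"C:\Projects\myFragmentXSLT.xsl");

        using (XmlReader reader = XmlReader.Create(@"C:\Projects\myxml.xml"))
        using (var output = new StreamWriter(@"C:\Projects\output.xml"))
        {
            // Advance to each "record" element in turn (assumed element name).
            while (reader.ReadToFollowing("record"))
            {
                // ReadSubtree exposes only the current element and its children,
                // so the transform loads one record into memory, not the whole file.
                using (XmlReader fragment = reader.ReadSubtree())
                {
                    transform.Transform(fragment, null, output);
                }
                output.WriteLine();
            }
        }
    }
}
```

The output of this sketch is a sequence of transformed fragments rather than a single well-formed document; if one document is required, a wrapping root element would have to be written around the loop.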