
How to overcome OutOfMemoryException pulling large xml documents from an API?

I am pulling 1M+ records from an API. The request itself succeeds, but I get an OutOfMemoryException when calling ReadToEnd to read the response into a string variable.

Here's the code:

        XDocument xmlDoc = new XDocument();

        HttpWebRequest client = (HttpWebRequest)WebRequest.Create(uri);
        client.Timeout = 2100000;//35 minutes
        WebResponse apiResponse = client.GetResponse();

        Stream receivedStream = apiResponse.GetResponseStream();
        StreamReader reader = new StreamReader(receivedStream);

        string s = reader.ReadToEnd();

Stack trace:

at System.Text.StringBuilder.ToString()
at System.IO.StreamReader.ReadToEnd()
at MyApplication.DataBuilder.getDataFromAPICall(String uri) in
    c:\Users\RDESLONDE\Documents\Projects\MyApplication\MyApplication\DataBuilder.cs:line 578
at MyApplication.DataBuilder.GetDataFromAPIAsXDoc(String uri) in
    c:\Users\RDESLONDE\Documents\Projects\MyApplication\MyApplication\DataBuilder.cs:line 543

What can I do to work around this?

richard asked Jan 14 '23 20:01


1 Answer

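It sounds like the response is too big to hold in memory in your environment. Your stack trace shows the failure inside ReadToEnd, which has to build the whole document as one string (two bytes per character in .NET) before any parsing starts, and loading a DOM on top of that needs even more. This is especially likely in a 32-bit process, where the usable address space is roughly 2 GB by default (you haven't indicated whether this is the case).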
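
If you only need to look at the records rather than keep them, the simplest way around the exception is to never build the string at all and hand the response stream straight to an XmlReader. Below is a minimal sketch of that idea. The class and method names and the element name "record" are made up for illustration, and the MemoryStream stands in for apiResponse.GetResponseStream() from the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class StreamingCount
{
    // Counts elements with the given local name by walking the stream node by node.
    // No string of the whole document and no DOM is ever created.
    public static int CountElements(Stream xml, string elementName)
    {
        int count = 0;
        using (var reader = XmlReader.Create(xml))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    count++;
                }
            }
        }
        return count;
    }

    static void Main()
    {
        // Stand-in for the API response stream (hypothetical sample data).
        var bytes = Encoding.UTF8.GetBytes("<records><record id=\"1\"/><record id=\"2\"/></records>");
        using (var stream = new MemoryStream(bytes))
        {
            Console.WriteLine(CountElements(stream, "record")); // prints 2
        }
    }
}
```

Memory use here stays roughly constant regardless of how many records the API returns, because the reader only holds the current node.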

You can combine the speed and memory efficiency of XmlReader with the convenience of XElement/XNode by reading one element at a time, and use an XStreamingElement to save the transformed content after processing. This is much more memory-efficient for large files.

Here's an example in pseudo-code:

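    // use an XStreamingElement for writing
    // (stream, outstream and ProcessNode are placeholders)
    var st = new XStreamingElement("root");
    using (var xr = XmlReader.Create(stream))
    {
        xr.MoveToContent(); // position on the document's root element
        xr.Read();          // step inside it, otherwise ReadFrom would load the whole document
        while (!xr.EOF)
        {
            // whatever you're interested in
            if (xr.NodeType == XmlNodeType.Element)
            {
                // ReadFrom consumes the element and leaves the reader on the
                // node that follows it, so don't call Read() again here
                var node = (XElement)XNode.ReadFrom(xr);
                ProcessNode(node);
                // note: nodes added this way stay in memory until Save
                st.Add(node);
            }
            else
            {
                xr.Read(); // skip whitespace, comments, end tags
            }
        }
    }
    st.Save(outstream); // or st.WriteTo(xmlwriter);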
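
One caveat with the example above: every node passed to st.Add is kept until Save runs, so for a very large response the memory still grows. A sketch that avoids this passes a lazy sequence to the XStreamingElement constructor, so each element is read, written and released in turn. It also shows one way this might be wired to the request from the question. The names ApiXmlPipeline, ReadChildren, Download and outputPath are hypothetical, and the code assumes the records are direct children of the root element.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Xml;
using System.Xml.Linq;

static class ApiXmlPipeline
{
    // Lazily yields each child element of the root, one at a time.
    public static IEnumerable<XElement> ReadChildren(Stream input)
    {
        using (var reader = XmlReader.Create(input))
        {
            reader.MoveToContent(); // root element
            reader.Read();          // first node inside the root
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // ReadFrom advances the reader past the element by itself.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read(); // skip whitespace, comments, end tags
                }
            }
        }
    }

    // Hypothetical wiring: streams the API response at uri into a file at outputPath.
    public static void Download(string uri, string outputPath)
    {
        var request = (HttpWebRequest)WebRequest.Create(uri);
        request.Timeout = 2100000; // 35 minutes, as in the question
        using (var response = request.GetResponse())
        using (var body = response.GetResponseStream())
        using (var output = File.Create(outputPath))
        {
            // The sequence is only enumerated while Save is writing.
            new XStreamingElement("root", ReadChildren(body)).Save(output);
        }
    }
}
```

Any per-record processing can be applied inside ReadChildren before the yield, or by wrapping the sequence with a LINQ Select before handing it to XStreamingElement.
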

Anthill answered Feb 08 '23 22:02