Newtonsoft Json deserializer not releasing memory

I'm using a StreamReader with a JsonTextReader to deserialize a large JSON file containing tens of thousands of small objects, and it's consuming far more memory than seems reasonable, to the point of running out. I'm using what I understand is the recommended pattern for reading large files.

Code simplified for expository purposes:

using (StreamReader streamReader = new StreamReader(stream))
using (JsonTextReader reader = new JsonTextReader(streamReader))
{
    JToken token;
    while (reader.Read() && reader.TokenType != JsonToken.EndArray)
    {
        token = JToken.Load(reader);
        RawResult result = token.ToObject<RawResult>();
        results.Add(result);
    }
}

The VS2015 memory profiler is telling me that most of the memory is being consumed by Newtonsoft.Json.Linq.JValue objects, which is bizarre: once the current token has been converted with ToObject(), there is no reason (as far as I am concerned) why it shouldn't just be discarded.

I'm assuming that the Newtonsoft library is retaining all of the JSON parsed so far in memory. I don't need it to do this and I think if I could prevent this my memory problems would go away.

What can be done?

Ian Goldby asked Apr 06 '17 13:04


1 Answer

It doesn't look like you need to be using JTokens as an intermediary; you could just deserialize directly to your RawResult class inside your loop.

using (StreamReader streamReader = new StreamReader(stream))
using (JsonTextReader reader = new JsonTextReader(streamReader))
{
    var serializer = new JsonSerializer();
    while (reader.Read() && reader.TokenType != JsonToken.EndArray)
    {
        RawResult result = serializer.Deserialize<RawResult>(reader);
        results.Add(result);
    }
}
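
For anyone who wants to try this outside the original project, here is a minimal self-contained sketch of the same direct-deserialization loop. The shape of RawResult (Id, Name), the ReadAll helper, and the sample JSON are assumptions made up for illustration and are not from the question. The sketch also checks for JsonToken.StartObject before deserializing, so it works on a reader that has not yet consumed the opening bracket of the array.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical stand-in for the RawResult type from the question.
public class RawResult
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class DirectDeserializeSketch
{
    // Reads a top-level JSON array of objects, deserializing each element
    // straight from the reader without building intermediate JToken trees.
    public static List<RawResult> ReadAll(TextReader textReader)
    {
        var results = new List<RawResult>();
        var serializer = new JsonSerializer();

        using (var reader = new JsonTextReader(textReader))
        {
            while (reader.Read())
            {
                // Array start/end tokens are skipped; only object starts are deserialized.
                // Deserialize consumes the whole object, including any nested objects.
                if (reader.TokenType == JsonToken.StartObject)
                {
                    results.Add(serializer.Deserialize<RawResult>(reader));
                }
            }
        }

        return results;
    }

    public static void Main()
    {
        const string json = "[{\"Id\":1,\"Name\":\"a\"},{\"Id\":2,\"Name\":\"b\"}]";
        List<RawResult> results = ReadAll(new StringReader(json));
        Console.WriteLine(results.Count); // prints 2
    }
}
```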

Also note that by adding your result items to a list, you are keeping them all in memory. If you can process them one at a time and write each result individually to your output (file, database, network stream, etc.), you can save memory that way as well.

        RawResult result = serializer.Deserialize<RawResult>(reader);
        ProcessResult(result);  // process result now instead of adding to a list
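
One way to structure the one-at-a-time approach is to wrap the read loop in an iterator, so the caller pulls items lazily and nothing accumulates in a list. This is only a sketch under assumptions: ReadItems, the single-property RawResult, and the console-writing ProcessResult are hypothetical names invented for the example, and the input is assumed to be a top-level JSON array of objects.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical stand-in for the RawResult type from the question.
public class RawResult
{
    public int Id { get; set; }
}

public static class StreamingSketch
{
    // Lazily yields one deserialized item at a time from a top-level JSON array.
    // Each item becomes eligible for garbage collection once the caller is done with it.
    public static IEnumerable<T> ReadItems<T>(TextReader textReader)
    {
        var serializer = new JsonSerializer();
        using (var reader = new JsonTextReader(textReader))
        {
            while (reader.Read())
            {
                if (reader.TokenType == JsonToken.StartObject)
                {
                    yield return serializer.Deserialize<T>(reader);
                }
            }
        }
    }

    // Hypothetical per-item handler; replace with a write to a file, database, etc.
    private static void ProcessResult(RawResult result)
    {
        Console.WriteLine(result.Id);
    }

    public static void Main()
    {
        using (var textReader = new StringReader("[{\"Id\":1},{\"Id\":2},{\"Id\":3}]"))
        {
            foreach (RawResult result in ReadItems<RawResult>(textReader))
            {
                ProcessResult(result);
            }
        }
    }
}
```

Because the iterator runs only as the foreach advances, the reader stays open until enumeration finishes, so the underlying stream must not be disposed before then.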
Brian Rogers answered Oct 30 '22 01:10