
Json.Net deserialize out of memory issue

I have a JSON document that contains, among other things, a data field storing a Base64-encoded string. This JSON is serialized and sent to a client.

On the client side, the Newtonsoft Json.NET deserializer is used to read the JSON back. However, if the data field becomes large (~400 MB), the deserializer throws an out-of-memory exception: Array Dimensions exceeded supported Range. I can also see in Task Manager that memory consumption grows very quickly.

Any ideas why this is? Is there a maximum size for json fields or something?

Code example (simplified):

HttpResponseMessage responseTemp = null;
responseTemp = client.PostAsJsonAsync(client.BaseAddress, message).Result;

string jsonContent = responseTemp.Content.ReadAsStringAsync().Result;
result = JsonConvert.DeserializeObject<Result>(jsonContent);

Result class:

public class Result
{

    public string Message { get; set; }
    public byte[] Data { get; set; }

}

UPDATE:

I think my problem is not the serializer, but simply trying to handle such a huge string in memory. At the point where I read the string into memory, the memory consumption of the application explodes, and every operation on that string does the same. At the moment, I think I have to find a way to work with streams and stop reading everything into memory at once.

DanielG asked Nov 02 '15 14:11


3 Answers

Reading a large JSON string with JsonConvert.DeserializeObject consumes a lot of memory. One way to work around this is to create a JsonSerializer instance and deserialize directly from a stream, as shown below.

using (StreamReader r = new StreamReader(filePath))
{
    using (JsonReader reader = new JsonTextReader(r))
    {
        JsonSerializer serializer = new JsonSerializer();
        T lstObjects = serializer.Deserialize<T>(reader);
    }
}

Here, filePath is the path to your JSON file and T is the generic type to deserialize into.

Dilip0165 answered Nov 08 '22 03:11


You have two problems here:

  1. You have a single Base64 data field inside your JSON response that is larger than ~400 MB.

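  2. You are loading the entire response into an intermediate string jsonContent that is even larger since it embeds the single data field. (A .NET string also uses two bytes per character, so the in-memory string is roughly twice the size of the Base64 text on the wire.)

Firstly, I assume you are running as a 64-bit process. If not, switch.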

Unfortunately, the first problem can only be ameliorated and not fixed because Json.NET's JsonTextReader does not have the ability to read a single string value in "chunks" in the same way as XmlReader.ReadValueChunk(). It will always fully materialize each atomic string value. But .NET 4.5 and 4.5.1 add the following settings that may help:

  1. <gcAllowVeryLargeObjects enabled="true" />.

    This setting (available from .NET 4.5, on 64-bit platforms only) allows for arrays with up to roughly int.MaxValue entries even if that would cause the underlying memory buffer to be larger than 2 GB. You will still be unable to read a single JSON token of more than 2^31 characters in length, however, since JsonTextReader buffers the full contents of each single token in a private char[] _chars; array, and, in .NET, an array can only hold up to int.MaxValue items.

  2. GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce.

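    This setting (available from .NET 4.5.1) allows the large object heap to be compacted and may reduce out-of-memory errors due to address space fragmentation. The compaction happens during the next blocking full garbage collection, after which the setting resets to its default.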
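
For reference, the first setting goes into the application's configuration file (app.config for a desktop or console application). A minimal fragment, assuming no other runtime settings are present, might look like this:

```xml
<configuration>
  <runtime>
    <!-- Only takes effect in a 64-bit process -->
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
```

The second setting is not a configuration entry; it is assigned in code, as written in item 2 above.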

The second problem, however, can be addressed by streaming deserialization, as shown in this answer to this question by Dilip0165; Efficient api calls with HttpClient and JSON.NET by John Thiriet; Performance Tips: Optimize Memory Usage by Newtonsoft; and Streaming with New .NET HttpClient and HttpCompletionOption.ResponseHeadersRead by Tugberk Ugurlu. Pulling together the information from these sources, your code should look something like:

Result result;
var requestJson = JsonConvert.SerializeObject(message); // Here we assume the request JSON is not too large
using (var requestContent = new StringContent(requestJson, Encoding.UTF8, "application/json"))
using (var request = new HttpRequestMessage(HttpMethod.Post, client.BaseAddress) { Content = requestContent })
using (var response = client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).Result)
using (var responseStream = response.Content.ReadAsStreamAsync().Result)
{
    if (response.IsSuccessStatusCode)
    {
        using (var textReader = new StreamReader(responseStream))
        using (var jsonReader = new JsonTextReader(textReader))
        {
            result = JsonSerializer.CreateDefault().Deserialize<Result>(jsonReader);
        }
    }
    else
    {
        // TODO: handle an unsuccessful response somehow, e.g. by throwing an exception
    }
}

Or, using async/await:

Result result;
var requestJson = JsonConvert.SerializeObject(message); // Here we assume the request JSON is not too large
using (var requestContent = new StringContent(requestJson, Encoding.UTF8, "application/json"))
using (var request = new HttpRequestMessage(HttpMethod.Post, client.BaseAddress) { Content = requestContent })
using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
using (var responseStream = await response.Content.ReadAsStreamAsync())
{
    if (response.IsSuccessStatusCode)
    {
        using (var textReader = new StreamReader(responseStream))
        using (var jsonReader = new JsonTextReader(textReader))
        {
            result = JsonSerializer.CreateDefault().Deserialize<Result>(jsonReader);
        }
    }
    else
    {
        // TODO: handle an unsuccessful response somehow, e.g. by throwing an exception
    }
}           

My code above isn't fully tested, and error and cancellation handling need to be implemented. You may also need to set the timeout as shown here and here. Json.NET's JsonSerializer does not support async deserialization, making it a slightly awkward fit with the asynchronous programming model of HttpClient.

Finally, as an alternative to using Json.NET to read a huge Base64 chunk from a JSON file, you could use the reader returned by JsonReaderWriterFactory, which does support reading Base64 data in manageable chunks. For details, see this answer to Parse huge OData JSON by streaming certain sections of the json to avoid LOH for an explanation of how to stream through a huge JSON file using this reader, and this answer to Read stream from XmlReader, base64 decode it and write result to file for how to decode Base64 data in chunks using XmlReader.ReadElementContentAsBase64.
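
As a rough, untested sketch of that last idea: the helper below scans a JSON stream with the reader from JsonReaderWriterFactory and copies the decoded bytes of one Base64 field to an output stream in chunks. The class name Base64FieldExtractor, the method name CopyBase64Field and the 80 KB buffer size are hypothetical choices for illustration, not part of any library, and the sketch assumes the field is a plain JSON string property.

```csharp
using System.IO;
using System.Runtime.Serialization.Json;
using System.Xml;

public static class Base64FieldExtractor
{
    // Copies the decoded bytes of the first property named fieldName
    // from jsonStream to output. Returns false if no such property exists.
    public static bool CopyBase64Field(Stream jsonStream, string fieldName, Stream output)
    {
        // The reader exposes the JSON as an XML infoset; each property becomes an element.
        using (XmlDictionaryReader reader =
            JsonReaderWriterFactory.CreateJsonReader(jsonStream, XmlDictionaryReaderQuotas.Max))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == fieldName)
                {
                    var buffer = new byte[81920]; // hypothetical chunk size
                    int read;
                    // Each call decodes at most buffer.Length bytes; 0 signals the end of the value.
                    while ((read = reader.ReadElementContentAsBase64(buffer, 0, buffer.Length)) > 0)
                    {
                        output.Write(buffer, 0, read);
                    }
                    return true;
                }
            }
        }
        return false;
    }
}
```

If output is a FileStream and jsonStream is the response stream obtained with HttpCompletionOption.ResponseHeadersRead, neither the Base64 text nor the decoded byte[] should need to be held in memory in full. The Result class would then have to be populated by hand, since this bypasses Json.NET entirely.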

dbc answered Nov 08 '22 01:11


Huge Base64 strings aren't a problem as such; .NET supports object sizes of around 2 GB (see the answer here). Of course, that doesn't mean you can store 2 GB of information in an object!

However, I get the feeling that it's the byte[] that's the problem.

If there are too many elements for a byte[] to contain, it doesn't matter whether you stream the result or even read it from a file on your hard drive.

So, just for testing purposes, can you try changing the type of that property from byte[] to string, or perhaps even a List? It's not elegant, or perhaps even advisable, but it might point the way to a better solution.

Edit:

Another test case to try: instead of calling DeserializeObject, just save that jsonContent string to a file and see how big it is.

Also, why do you need it in memory? What sort of data is it? It seems to me that if you've got to process this in memory then you're going to have a bad time - the size of the object is simply too large for the CLR.

Just had a little inspiration, however: what about trying a different deserializer? Perhaps RestSharp, or you can use the ReadAsAsync<T> extension method on HttpContent (response.Content.ReadAsAsync<T>()). It is possible that it's Newtonsoft itself that has a problem, especially if the size of the content is around 400 MB.

Russ Clarke answered Nov 08 '22 03:11