
System.Text.Json deserialization fails with JsonException "read too much or not enough"

This question applies to custom deserialization classes for System.Text.Json in .NET Core 3.1.

I'm trying to understand why a custom deserialization class needs to read to the end of the JSON stream even though it has already produced the required data. If it does not, deserialization fails with a JsonException whose message ends with "read too much or not enough."

I read through the Microsoft documentation for System.Text.Json ([1], [2]), but couldn't figure it out.

Here is an example of the document:

{
    "Response": {
        "Result": [
            {
                "Code": "CLF",
                "Id": 49,
                "Type": "H"
            },
            {
                "Code": "CLF",
                "Id": 42,
                "Type": "C"
            }
        ]
    }
}

The DTO class and deserialization method are defined as follows:

public class EntityDto
{
    public string Code { get; set; }
    public int Id { get; set; }
    public string Type { get; set; } 
}

// This method is a part of class EntityDtoIEnumerableConverter : JsonConverter<IEnumerable<EntityDto>>
public override IEnumerable<EntityDto> Read(
    ref Utf8JsonReader reader,
    Type typeToConvert,
    JsonSerializerOptions options)
{
    if (reader.TokenType != JsonTokenType.StartObject)
    {
        throw new JsonException("JSON payload expected to start with StartObject token.");
    }

    while ((reader.TokenType != JsonTokenType.StartArray) && reader.Read()) { }

    var eodPostions = JsonSerializer.Deserialize<EntityDto[]>(ref reader, options);

    // This loop is required to not get JsonException
    while (reader.Read()) { }

    return new List<EntityDto>(eodPostions);
}

Here is how the deserialization class is called.

var serializerOptions = new JsonSerializerOptions
{
    PropertyNameCaseInsensitive = true
};
serializerOptions.Converters.Add(new EntityDtoIEnumerableConverter());

HttpResponseMessage message = await httpClient.GetAsync(requestUrl);
message.EnsureSuccessStatusCode();

var contentStream = await message.Content.ReadAsStreamAsync();
var result = await JsonSerializer.DeserializeAsync<IEnumerable<EntityDto>>(contentStream, serializerOptions);

When the final loop, while (reader.Read()) { }, is absent from the deserialization method or commented out, the last call, await JsonSerializer.DeserializeAsync<..., fails with a JsonException whose message ends with "read too much or not enough". Can anyone explain why? Or is there a better way to write this deserialization?

Updated the second block of code to use EntityDtoIEnumerableConverter.

GKalnytskyi asked Feb 03 '23 15:02


1 Answer

When reading an object, JsonConverter<T>.Read() must leave the Utf8JsonReader positioned on the EndObject token of the object where it was originally positioned. (And for arrays, the EndArray of the original array.) When writing a Read() method that parses through multiple levels of JSON, this can be done by remembering the CurrentDepth of the reader upon entry, and then reading until an EndObject is found at the same depth.
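
To see which depth the reader reports for each token, a small trace like the following can help. It is only a sketch: the class name DepthTrace is made up for illustration, and the payload is the sample from the question cut down to a single array element.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical helper for illustration only; not part of the converter.
public static class DepthTrace
{
    public static void Main()
    {
        // Sample payload from the question, trimmed to one array element.
        const string json =
            "{\"Response\":{\"Result\":[{\"Code\":\"CLF\",\"Id\":49,\"Type\":\"H\"}]}}";

        ReadOnlySpan<byte> utf8 = Encoding.UTF8.GetBytes(json);
        var reader = new Utf8JsonReader(utf8);

        // Print every token together with the depth the reader reports for it.
        while (reader.Read())
        {
            Console.WriteLine($"{reader.TokenType,-12} depth={reader.CurrentDepth}");
        }
    }
}
```

In the trace, the final EndObject of the root object should be reported at the same CurrentDepth as its opening StartObject. That pairing is the condition a Read() method has to wait for before returning.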

Since your EntityDtoIEnumerableConverter.Read() method seems to be trying to descend the JSON token hierarchy until an array is encountered, upon which it deserializes the array into an EntityDto[] (essentially peeling off the "Response" and "Result" wrapper properties), your code can be rewritten as follows:

public override IEnumerable<EntityDto> Read(
    ref Utf8JsonReader reader,
    Type typeToConvert,
    JsonSerializerOptions options)
{
    if (reader.TokenType != JsonTokenType.StartObject)
    {
        throw new JsonException("JSON payload expected to start with StartObject token.");
    }

    List<EntityDto> list = null;    
    var startDepth = reader.CurrentDepth;

    while (reader.Read())
    {
        if (reader.TokenType == JsonTokenType.EndObject && reader.CurrentDepth == startDepth)
            return list;
        if (reader.TokenType == JsonTokenType.StartArray)
        {
            if (list != null)
                throw new JsonException("Multiple lists encountered.");
            var eodPostions = JsonSerializer.Deserialize<EntityDto[]>(ref reader, options);
            list = new List<EntityDto>(eodPostions);
        }
    }
    throw new JsonException(); // Truncated file or internal error
}

Notes:

  • In your original code you returned as soon as the array was deserialized. Since JsonSerializer.Deserialize<EntityDto[]>(ref reader, options) only advances the reader to the end of the nested array, you never advanced the reader to the required object end. This caused the exception you were seeing. (Advancing until the end of the JSON stream also seems to have worked when the current object was the root object, but would not have worked for nested objects.)

  • None of the converters currently shown in the documentation article "How to write custom converters for JSON serialization (marshalling) in .NET", to which you linked, attempts to flatten multiple levels of JSON into a single .NET object as you are doing, so the need to track the current depth seems not to have arisen there in practice.
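
If manual depth tracking feels error-prone, one possible alternative is to let JsonDocument.ParseValue consume the whole current value and then navigate the parsed document. This is a different technique from the streaming approach above, shown here only as a sketch. The class name EntityDtoDocumentConverter is hypothetical, the property names "Response" and "Result" are taken from the sample document, and it assumes the payload is small enough to buffer in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class EntityDto
{
    public string Code { get; set; }
    public int Id { get; set; }
    public string Type { get; set; }
}

// Hypothetical variant for illustration; not the converter from the answer above.
public class EntityDtoDocumentConverter : JsonConverter<IEnumerable<EntityDto>>
{
    public override IEnumerable<EntityDto> Read(
        ref Utf8JsonReader reader,
        Type typeToConvert,
        JsonSerializerOptions options)
    {
        // ParseValue reads the entire current value, so the reader ends up on
        // the last token of that value without any manual depth bookkeeping.
        using (JsonDocument document = JsonDocument.ParseValue(ref reader))
        {
            // GetProperty is case-sensitive and throws if a property is missing;
            // PropertyNameCaseInsensitive does not apply to this lookup.
            JsonElement result = document.RootElement
                .GetProperty("Response")
                .GetProperty("Result");

            // .NET Core 3.1 has no Deserialize overload that accepts a JsonElement,
            // so this goes through the raw JSON text of the array instead.
            EntityDto[] items =
                JsonSerializer.Deserialize<EntityDto[]>(result.GetRawText(), options);
            return new List<EntityDto>(items);
        }
    }

    public override void Write(
        Utf8JsonWriter writer,
        IEnumerable<EntityDto> value,
        JsonSerializerOptions options)
    {
        // Writing is out of scope for this sketch.
        throw new NotSupportedException("This sketch only covers reading.");
    }
}
```

The trade-off is that the whole wrapper object is buffered and the array is parsed twice, so the streaming version is likely the better fit for large responses.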

Demo fiddle here.

dbc answered Feb 06 '23 11:02