
What is the equivalent of JToken.DeepEquals in System.Text.Json?

I want to migrate my code from Newtonsoft Json.NET to Microsoft's System.Text.Json, but I could not find an alternative to JToken.DeepEquals.

Basically, the code must compare two JSON documents in a unit test: a reference JSON and a result JSON. With Newtonsoft I created two JObject instances and compared them with JToken.DeepEquals. Here is the example code:

[TestMethod]
public void ExampleUnitTest()
{
    string resultJson = TestedUnit.TestedMethod();
    string referenceJson =
    @"
    {
      ...bla bla bla...
      ...some JSON Content...
      ...bla bla bla...
    }";

    JObject expected = ( JObject )JsonConvert.DeserializeObject( referenceJson );
    JObject result = ( JObject )JsonConvert.DeserializeObject( resultJson );
    Assert.IsTrue( JToken.DeepEquals( result, expected ) );
}

If I am correct, the counterpart of Newtonsoft's JObject in System.Text.Json is JsonDocument. I am able to create one, but I don't know how to compare the contents of two of them.

System.Text.Json.JsonDocument expectedDoc = System.Text.Json.JsonDocument.Parse( referenceJson );
System.Text.Json.JsonDocument resultDoc = System.Text.Json.JsonDocument.Parse( resultJson );

Compare???( expectedDoc, resultDoc );

Of course, a plain string comparison is not a solution, because neither the formatting of the JSON nor the order of the properties should matter.

György Gulyás asked Feb 04 '23 15:02


1 Answer

There is no equivalent in System.Text.Json as of .NET Core 3.1, so we will have to roll our own. Here's one possible IEqualityComparer<JsonElement>:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public class JsonElementComparer : IEqualityComparer<JsonElement>
{
    public JsonElementComparer() : this(-1) { }

    public JsonElementComparer(int maxHashDepth) => this.MaxHashDepth = maxHashDepth;

    int MaxHashDepth { get; } = -1;

    #region IEqualityComparer<JsonElement> Members

    public bool Equals(JsonElement x, JsonElement y)
    {
        if (x.ValueKind != y.ValueKind)
            return false;
        switch (x.ValueKind)
        {
            case JsonValueKind.Null:
            case JsonValueKind.True:
            case JsonValueKind.False:
            case JsonValueKind.Undefined:
                return true;
                
            // Compare the raw values of numbers, and the text of strings.
            // Note this means that 0.0 will differ from 0.00 -- which may be correct as deserializing either to `decimal` will result in subtly different results.
            // Newtonsoft's JValue.Compare(JTokenType valueType, object? objA, object? objB) has logic for detecting "equivalent" values, 
            // you may want to examine it to see if anything there is required here.
            // https://github.com/JamesNK/Newtonsoft.Json/blob/master/Src/Newtonsoft.Json/Linq/JValue.cs#L246
            case JsonValueKind.Number:
                return x.GetRawText() == y.GetRawText();

            case JsonValueKind.String:
                return x.GetString() == y.GetString(); // Do not use GetRawText() here, it does not automatically resolve JSON escape sequences to their corresponding characters.
                
            case JsonValueKind.Array:
                return x.EnumerateArray().SequenceEqual(y.EnumerateArray(), this);
            
            case JsonValueKind.Object:
                {
                    // Surprisingly, JsonDocument fully supports duplicate property names.
                    // I.e. it's perfectly happy to parse {"Value":"a", "Value" : "b"} and will store both
                    // key/value pairs inside the document!
                    // A close reading of https://www.rfc-editor.org/rfc/rfc8259#section-4 seems to indicate that
                    // such objects are allowed but not recommended, and when they arise, interpretation of 
                    // identically-named properties is order-dependent.  
                    // So stably sorting by name then comparing values seems the way to go.
                    var xPropertiesUnsorted = x.EnumerateObject().ToList();
                    var yPropertiesUnsorted = y.EnumerateObject().ToList();
                    if (xPropertiesUnsorted.Count != yPropertiesUnsorted.Count)
                        return false;
                    var xProperties = xPropertiesUnsorted.OrderBy(p => p.Name, StringComparer.Ordinal);
                    var yProperties = yPropertiesUnsorted.OrderBy(p => p.Name, StringComparer.Ordinal);
                    foreach (var (px, py) in xProperties.Zip(yProperties))
                    {
                        if (px.Name != py.Name)
                            return false;
                        if (!Equals(px.Value, py.Value))
                            return false;
                    }
                    return true;
                }
                
            default:
                throw new JsonException(string.Format("Unknown JsonValueKind {0}", x.ValueKind));
        }
    }

    public int GetHashCode(JsonElement obj)
    {
        var hash = new HashCode(); // New in .Net core: https://learn.microsoft.com/en-us/dotnet/api/system.hashcode
        ComputeHashCode(obj, ref hash, 0);
        return hash.ToHashCode();
    }

    void ComputeHashCode(JsonElement obj, ref HashCode hash, int depth)
    {
        hash.Add(obj.ValueKind);

        switch (obj.ValueKind)
        {
            case JsonValueKind.Null:
            case JsonValueKind.True:
            case JsonValueKind.False:
            case JsonValueKind.Undefined:
                break;
                
            case JsonValueKind.Number:
                hash.Add(obj.GetRawText());
                break;

            case JsonValueKind.String:
                hash.Add(obj.GetString());
                break;
                
            case JsonValueKind.Array:
                if (depth != MaxHashDepth)
                    foreach (var item in obj.EnumerateArray())
                        ComputeHashCode(item, ref hash, depth+1);
                else
                    hash.Add(obj.GetArrayLength());
                break;
            
            case JsonValueKind.Object:
                foreach (var property in obj.EnumerateObject().OrderBy(p => p.Name, StringComparer.Ordinal))
                {
                    hash.Add(property.Name);
                    if (depth != MaxHashDepth)
                        ComputeHashCode(property.Value, ref hash, depth+1);
                }
                break;
                
            default:
                throw new JsonException(string.Format("Unknown JsonValueKind {0}", obj.ValueKind));
        }            
    }
    
    #endregion
}

Use it as follows:

var comparer = new JsonElementComparer();
using var doc1 = System.Text.Json.JsonDocument.Parse(referenceJson);
using var doc2 = System.Text.Json.JsonDocument.Parse(resultJson);
Assert.IsTrue(comparer.Equals(doc1.RootElement, doc2.RootElement));

Notes:

  • Since Json.NET resolves floating-point JSON values to double or decimal during parsing, JToken.DeepEquals() considers floating-point values that differ only in trailing zeros to be identical. I.e. the following assertion passes:

    Assert.IsTrue(JToken.DeepEquals(JToken.Parse("1.0"), JToken.Parse("1.00")));
    

    My comparer does not consider these two to be equal. I consider this desirable because applications sometimes want to preserve trailing zeros, e.g. when deserializing to decimal, so the difference may matter. (For an example, see "Json.Net not serializing decimals the same way twice".) If you want to treat such JSON values as identical, you will need to modify the JsonValueKind.Number cases in ComputeHashCode() and Equals(JsonElement x, JsonElement y) to trim trailing zeros when they appear after a decimal point.

  • Making the above harder is the fact that, surprisingly, JsonDocument fully supports duplicate property names! I.e. it's perfectly happy to parse {"Value":"a", "Value" : "b"} and will store both key/value pairs inside the document.

    A close reading of https://www.rfc-editor.org/rfc/rfc8259#section-4 seems to indicate that such objects are allowed but not recommended, and when they arise, interpretation of identically-named properties may be order-dependent. I handled this by stably sorting the property lists by property name, then walking the lists and comparing names and values. If you don't care about duplicate property names, you could probably improve performance by using a single lookup dictionary instead of two sorted lists.

  • JsonDocument is disposable, and in fact needs to be disposed of according to the docs:

    This class utilizes resources from pooled memory to minimize the impact of the garbage collector (GC) in high-usage scenarios. Failure to properly dispose this object will result in the memory not being returned to the pool, which will increase GC impact across various parts of the framework.

    In your question you do not do this, but you should.

  • There is currently an open enhancement, "System.Text.Json: add ability to do semantic comparisons of JSON values à la JToken.DeepEquals()" #33388, to which the development team replied, "this isn't on our roadmap right now."
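
The two modifications suggested in the notes (comparing numbers by value, and using a lookup dictionary when duplicate property names are not a concern) might look roughly like the sketch below. The class RelaxedJsonComparison and its method names are hypothetical and not part of the comparer above. The number check assumes both values fit in a decimal and otherwise falls back to the raw text; the object check assumes neither object contains duplicate property names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helpers; callers must pass elements of the expected ValueKind.
public static class RelaxedJsonComparison
{
    // Treats 1.0 and 1.00 as equal by comparing decimal values when both parse.
    public static bool NumbersEqual(JsonElement x, JsonElement y)
    {
        if (x.TryGetDecimal(out var dx) && y.TryGetDecimal(out var dy))
            return dx == dy;
        // Assumption: values outside the decimal range are compared as text.
        return x.GetRawText() == y.GetRawText();
    }

    // Hash that stays consistent with NumbersEqual.
    public static int NumberHash(JsonElement x) =>
        x.TryGetDecimal(out var d) ? d.GetHashCode() : x.GetRawText().GetHashCode();

    // Order-insensitive object comparison using one lookup dictionary.
    // Assumption: no duplicate property names (a duplicate would be overwritten).
    public static bool ObjectsEqual(
        JsonElement x, JsonElement y, IEqualityComparer<JsonElement> valueComparer)
    {
        var lookup = new Dictionary<string, JsonElement>(StringComparer.Ordinal);
        foreach (var property in y.EnumerateObject())
            lookup[property.Name] = property.Value;

        var count = 0;
        foreach (var property in x.EnumerateObject())
        {
            count++;
            if (!lookup.TryGetValue(property.Name, out var other)
                || !valueComparer.Equals(property.Value, other))
                return false;
        }
        return count == lookup.Count;
    }
}
```

NumbersEqual and NumberHash would stand in for the JsonValueKind.Number cases of Equals() and ComputeHashCode() respectively, and ObjectsEqual for the JsonValueKind.Object case of Equals(), with the comparer passing itself as valueComparer.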

Demo fiddle here.
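
If you are on a newer runtime than the one this answer was written against, it may be worth checking for built-in support first. As far as I know, .NET 8 added a static JsonNode.DeepEquals method in System.Text.Json.Nodes, and .NET 9 added JsonElement.DeepEquals. A minimal sketch, assuming .NET 8 or later (the wrapper class BuiltInDeepEqualsSketch and its JsonEquals method are hypothetical names used only for illustration):

```csharp
using System.Text.Json.Nodes;

public static class BuiltInDeepEqualsSketch
{
    // Parses both strings into JsonNode trees and compares them structurally.
    // Assumption: the target framework is .NET 8 or later.
    public static bool JsonEquals(string expectedJson, string actualJson)
    {
        var expected = JsonNode.Parse(expectedJson);
        var actual = JsonNode.Parse(actualJson);
        return JsonNode.DeepEquals(expected, actual);
    }
}
```

In a test this would be used as Assert.IsTrue(BuiltInDeepEqualsSketch.JsonEquals(referenceJson, resultJson)). How the built-in methods treat numbers such as 1.0 and 1.00 may differ between runtime versions, so verify that against your target framework before relying on it.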

dbc answered Feb 06 '23 10:02