Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Value-equals and circular references: how to resolve infinite recursion?

I have some classes that contain several fields. I need to compare them by value, i.e. two instances of a class are equal if their fields contain the same data. I have overridden the GetHashCode and Equals methods for that.

It can happen that these classes contain circular references.

Example: We want to model institutions (like government, sports clubs, whatever). An institution has a name. A Club is an institution that has a name and a list of members. Each member is a Person that has a name and a favourite institution. If a member of a certain club has this club as his favourite institution, we have a circular reference.

But circular references, in conjunction with value equality, lead to infinite recursion. Here is a code example:

interface IInstitution { string Name { get; } }

class Club : IInstitution
{
    public string Name { get; set; }
    public HashSet<Person> Members { get; set; }

    public override int GetHashCode() { return Name.GetHashCode() + Members.Count; }

    public override bool Equals(object obj)
    {
        Club other = obj as Club;
        if (other == null)
            return false;

        return Name.Equals(other.Name) && Members.SetEquals(other.Members);
    }
}

class Person
{
    public string Name { get; set; }
    public IInstitution FavouriteInstitution { get; set; }

    public override int GetHashCode() { return Name.GetHashCode(); }

    public override bool Equals(object obj)
    {
        Person other = obj as Person;
        if (other == null)
            return false;

        return Name.Equals(other.Name)
            && FavouriteInstitution.Equals(other.FavouriteInstitution);
    }
}

class Program
{
    public static void Main()
    {
        Club c1 = new Club { Name = "myClub", Members = new HashSet<Person>() };
        Person p1 = new Person { Name = "Johnny", FavouriteInstitution = c1 }
        c1.Members.Add(p1);

        Club c2 = new Club { Name = "myClub", Members = new HashSet<Person>() };
        Person p2 = new Person { Name = "Johnny", FavouriteInstitution = c2 }
        c2.Members.Add(p2);

        bool c1_and_c2_equal = c1.Equals(c2); // StackOverflowException!
            // c1.Equals(c2) calls Members.SetEquals(other.Members)
            // Members.SetEquals(other.Members) calls p1.Equals(p2)
            // p1.Equals(p2) calls c1.Equals(c2) 
    }
}

c1_and_c2_equal should return true, and in fact we (humans) can see that they are value-equal with a little bit of thinking, without running into infinite recursion. However, I can't really say how we figure that out. But since it is possible, I hope that there is a way to resolve this problem in code as well!

So the question is: How can I check for value equality without running into infinite recursions?

Note that I need to resolve circular references in general, not only the case from above. I'll call it a 2-circle since c1 references p1, and p1 references c1. There can be other n-circles, e.g. if a club A has a member M whose favourite is club B which has member N whose favourite club is A. That would be a 4-circle. Other object models might also allow n-circles with odd numbers n. I am looking for a way to resolve all these problems at once, since I won't know in advance which value n can have.

like image 597
Kjara Avatar asked Oct 05 '17 13:10

Kjara


2 Answers

An easy workaround (used in RDBMS) is to use a unique Id to identify a Person(any type). Then you don't need to compare every other property and you never run into such cuircular references.

Another way is to compare differently in Equals, so provide the deep check only for the type of the Equals and not for the referenced types. You could use a custom comparer:

public class PersonNameComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (x == null && y == null) return true;
        if (x == null || y == null) return false;
        if(object.ReferenceEquals(x, y)) return true;
        return x.Name == y.Name;
    }

    public int GetHashCode(Person obj)
    {
        return obj?.Name?.GetHashCode() ?? int.MinValue;
    }
}

Now you can change the Equals implementation of Club to avoid that the Members(Persons) will use their deep check which includes the institution but only their Name:

public override bool Equals(object obj)
{
    if (Object.ReferenceEquals(this, obj))
        return true;

    Club other = obj as Club;
    if (other == null)
        return false;

    var personNameComparer = new PersonNameComparer();
    return Name.Equals(other.Name) 
        && Members.Count == other.Members.Count 
        && !Members.Except(other.Members, personNameComparer).Any();
}

You notice that i can't use SetEquals because there is no overload for my custom comparer.

like image 195
Tim Schmelter Avatar answered Oct 03 '22 14:10

Tim Schmelter


Following the suggestion of Dryadwoods, I changed the Equals methods so that I can keep track of the items that were already compared.

First we need an equality comparer that checks reference equality for corresponding elements of pairs:

public class ValuePairRefEqualityComparer<T> : IEqualityComparer<(T,T)> where T : class
{
    public static ValuePairRefEqualityComparer<T> Instance
        = new ValuePairRefEqualityComparer<T>();
    private ValuePairRefEqualityComparer() { }

    public bool Equals((T,T) x, (T,T) y)
    {
        return ReferenceEquals(x.Item1, y.Item1)
            && ReferenceEquals(x.Item2, y.Item2);
    }

    public int GetHashCode((T,T) obj)
    {
        return RuntimeHelpers.GetHashCode(obj.Item1)
            + 2 * RuntimeHelpers.GetHashCode(obj.Item2);
    }
}

And here is the modified Equals method of Club:

static HashSet<(Club,Club)> checkedPairs
    = new HashSet<(Club,Club)>(ValuePairRefEqualityComparer<Club>.Instance);

public override bool Equals(object obj)
{
    Club other = obj as Club;
    if (other == null)
        return false;

    if (!Name.Equals(other.Name))
        return;

    if (checkedPairs.Contains((this,other)) || checkedPairs.Contains((other,this)))
        return true;

    checkedPairs.Add((this,other));

    bool membersEqual = Members.SetEquals(other.Members);
    checkedPairs.Clear();
    return membersEqual;
}

The version for Person is analogous. Note that I add (this,other) to checkedPairs and check if either (this,other) or (other,this) is contained because it might happen that after the first call of c1.Equals(c2), we end up with a call of c2.Equals(c1) instead of c1.Equals(c2). I am not sure if this actually happens, but since I can't see the implementation of SetEquals, I believe it is a possibility.

Since I am not happy with using a static field for the already checked pairs (it will not work if the program is concurrent!), I asked another question: make a variable last for a call stack.

like image 25
Kjara Avatar answered Oct 03 '22 12:10

Kjara