Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Comparing two collections for equality irrespective of the order of items in them

People also ask

What is used to compare two data collections?

A double bar graph can be used to compare two collections of data.

How do you compare equality of objects?

The behavior for performing loose equality using == is as follows: If the operands have the same type, they are compared as follows: Object: return true only if both operands reference the same object. String: return true only if both operands have the same characters in the same order.

How do you know if two lists are equal in darts?

dart has a listEquals method, does compare each element by element in the list, and returns equal comparison result with boolean type. true: Returns true if two lists are equal with ignoring the order of elements. false: Elements in the list.


It turns out Microsoft already has this covered in its testing framework: CollectionAssert.AreEquivalent

Remarks

Two collections are equivalent if they have the same elements in the same quantity, but in any order. Elements are equal if their values are equal, not if they refer to the same object.

Using reflector, I modified the code behind AreEquivalent() to create a corresponding equality comparer. It is more complete than existing answers, since it takes nulls into account, implements IEqualityComparer and has some efficiency and edge case checks. plus, it's Microsoft :)

public class MultiSetComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    private readonly IEqualityComparer<T> m_comparer;
    public MultiSetComparer(IEqualityComparer<T> comparer = null)
    {
        m_comparer = comparer ?? EqualityComparer<T>.Default;
    }

    public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
    {
        if (first == null)
            return second == null;

        if (second == null)
            return false;

        if (ReferenceEquals(first, second))
            return true;

        if (first is ICollection<T> firstCollection && second is ICollection<T> secondCollection)
        {
            if (firstCollection.Count != secondCollection.Count)
                return false;

            if (firstCollection.Count == 0)
                return true;
        }

        return !HaveMismatchedElement(first, second);
    }

    private bool HaveMismatchedElement(IEnumerable<T> first, IEnumerable<T> second)
    {
        int firstNullCount;
        int secondNullCount;

        var firstElementCounts = GetElementCounts(first, out firstNullCount);
        var secondElementCounts = GetElementCounts(second, out secondNullCount);

        if (firstNullCount != secondNullCount || firstElementCounts.Count != secondElementCounts.Count)
            return true;

        foreach (var kvp in firstElementCounts)
        {
            var firstElementCount = kvp.Value;
            int secondElementCount;
            secondElementCounts.TryGetValue(kvp.Key, out secondElementCount);

            if (firstElementCount != secondElementCount)
                return true;
        }

        return false;
    }

    private Dictionary<T, int> GetElementCounts(IEnumerable<T> enumerable, out int nullCount)
    {
        var dictionary = new Dictionary<T, int>(m_comparer);
        nullCount = 0;

        foreach (T element in enumerable)
        {
            if (element == null)
            {
                nullCount++;
            }
            else
            {
                int num;
                dictionary.TryGetValue(element, out num);
                num++;
                dictionary[element] = num;
            }
        }

        return dictionary;
    }

    public int GetHashCode(IEnumerable<T> enumerable)
    {
        if (enumerable == null) throw new 
            ArgumentNullException(nameof(enumerable));

        int hash = 17;

        foreach (T val in enumerable)
            hash ^= (val == null ? 42 : m_comparer.GetHashCode(val));

        return hash;
    }
}

Sample usage:

var set = new HashSet<IEnumerable<int>>(new[] {new[]{1,2,3}}, new MultiSetComparer<int>());
Console.WriteLine(set.Contains(new [] {3,2,1})); //true
Console.WriteLine(set.Contains(new [] {1, 2, 3, 3})); //false

Or if you just want to compare two collections directly:

var comp = new MultiSetComparer<string>();
Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","c","b"})); //true
Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","b"})); //false

Finally, you can use your an equality comparer of your choice:

var strcomp = new MultiSetComparer<string>(StringComparer.OrdinalIgnoreCase);
Console.WriteLine(strcomp.Equals(new[] {"a", "b"}, new []{"B", "A"})); //true

A simple and fairly efficient solution is to sort both collections and then compare them for equality:

bool equal = collection1.OrderBy(i => i).SequenceEqual(
                 collection2.OrderBy(i => i));

This algorithm is O(N*logN), while your solution above is O(N^2).

If the collections have certain properties, you may be able to implement a faster solution. For example, if both of your collections are hash sets, they cannot contain duplicates. Also, checking whether a hash set contains some element is very fast. In that case an algorithm similar to yours would likely be fastest.


Create a Dictionary "dict" and then for each member in the first collection, do dict[member]++;

Then, loop over the second collection in the same way, but for each member do dict[member]--.

At the end, loop over all of the members in the dictionary:

    private bool SetEqual (List<int> left, List<int> right) {

        if (left.Count != right.Count)
            return false;

        Dictionary<int, int> dict = new Dictionary<int, int>();

        foreach (int member in left) {
            if (dict.ContainsKey(member) == false)
                dict[member] = 1;
            else
                dict[member]++;
        }

        foreach (int member in right) {
            if (dict.ContainsKey(member) == false)
                return false;
            else
                dict[member]--;
        }

        foreach (KeyValuePair<int, int> kvp in dict) {
            if (kvp.Value != 0)
                return false;
        }

        return true;

    }

Edit: As far as I can tell this is on the same order as the most efficient algorithm. This algorithm is O(N), assuming that the Dictionary uses O(1) lookups.


This is my (heavily influenced by D.Jennings) generic implementation of the comparison method (in C#):

/// <summary>
/// Represents a service used to compare two collections for equality.
/// </summary>
/// <typeparam name="T">The type of the items in the collections.</typeparam>
public class CollectionComparer<T>
{
    /// <summary>
    /// Compares the content of two collections for equality.
    /// </summary>
    /// <param name="foo">The first collection.</param>
    /// <param name="bar">The second collection.</param>
    /// <returns>True if both collections have the same content, false otherwise.</returns>
    public bool Execute(ICollection<T> foo, ICollection<T> bar)
    {
        // Declare a dictionary to count the occurence of the items in the collection
        Dictionary<T, int> itemCounts = new Dictionary<T,int>();

        // Increase the count for each occurence of the item in the first collection
        foreach (T item in foo)
        {
            if (itemCounts.ContainsKey(item))
            {
                itemCounts[item]++;
            }
            else
            {
                itemCounts[item] = 1;
            }
        }

        // Wrap the keys in a searchable list
        List<T> keys = new List<T>(itemCounts.Keys);

        // Decrease the count for each occurence of the item in the second collection
        foreach (T item in bar)
        {
            // Try to find a key for the item
            // The keys of a dictionary are compared by reference, so we have to
            // find the original key that is equivalent to the "item"
            // You may want to override ".Equals" to define what it means for
            // two "T" objects to be equal
            T key = keys.Find(
                delegate(T listKey)
                {
                    return listKey.Equals(item);
                });

            // Check if a key was found
            if(key != null)
            {
                itemCounts[key]--;
            }
            else
            {
                // There was no occurence of this item in the first collection, thus the collections are not equal
                return false;
            }
        }

        // The count of each item should be 0 if the contents of the collections are equal
        foreach (int value in itemCounts.Values)
        {
            if (value != 0)
            {
                return false;
            }
        }

        // The collections are equal
        return true;
    }
}

You could use a Hashset. Look at the SetEquals method.


static bool SetsContainSameElements<T>(IEnumerable<T> set1, IEnumerable<T> set2) {
    var setXOR = new HashSet<T>(set1);
    setXOR.SymmetricExceptWith(set2);
    return (setXOR.Count == 0);
}

Solution requires .NET 3.5 and the System.Collections.Generic namespace. According to Microsoft, SymmetricExceptWith is an O(n + m) operation, with n representing the number of elements in the first set and m representing the number of elements in the second. You could always add an equality comparer to this function if necessary.


If you use Shouldly, you can use ShouldAllBe with Contains.

collection1 = {1, 2, 3, 4};
collection2 = {2, 4, 1, 3};

collection1.ShouldAllBe(item=>collection2.Contains(item)); // true

And finally, you can write an extension.

public static class ShouldlyIEnumerableExtensions
{
    public static void ShouldEquivalentTo<T>(this IEnumerable<T> list, IEnumerable<T> equivalent)
    {
        list.ShouldAllBe(l => equivalent.Contains(l));
    }
}

UPDATE

A optional parameter exists on ShouldBe method.

collection1.ShouldBe(collection2, ignoreOrder: true); // true