
How to check whether the contents of more than two collections are the same

Tags:

c#

linq

I have a List. For valid reasons, I duplicate the List many times and use the copies for different purposes. At some point I need to check whether the contents of all these collections are the same.

Well, I know how to do this. But being a fan of "shorthand" coding (LINQ...), I would like to know whether I can check this EFFICIENTLY in the fewest lines of code.

    // Note: this initializer applies to duplicate4 only.
    List<string> original, duplicate1, duplicate2, duplicate3, duplicate4
                                       = new List<string>();

    //...some code.....
    bool isequal = duplicate4.SequenceEqual(duplicate3)
        && duplicate3.SequenceEqual(duplicate2)
        && duplicate2.SequenceEqual(duplicate1)
        && duplicate1.SequenceEqual(original); // can we do it better than this?

UPDATE

Codeinchaos pointed out certain scenarios I hadn't thought of (duplicates and the order of the list). Though SequenceEqual will take care of duplicates, the order of the list can be a problem. So I am changing the code as follows. I need to copy the Lists for this.

// Remove mutates the lists, so these must be copies if the originals are still needed.
List<List<string>> copy = new List<List<string>> { duplicate1, duplicate2,
                                                   duplicate3, duplicate4 };
bool isequal = original.All(x => copy.All(y => y.Remove(x)))
               && copy.All(n => n.Count == 0);

UPDATE2

Thanks to Eric: using a HashSet can be very efficient, as follows. This won't cover duplicates though.

List<HashSet<string>> copy2 = new List<HashSet<string>> { new HashSet<string>(duplicate1),
                                                          new HashSet<string>(duplicate2),
                                                          new HashSet<string>(duplicate3),
                                                          new HashSet<string>(duplicate4) };
HashSet<string> originalhashset = new HashSet<string>(original);
bool eq = copy2.All(x => originalhashset.SetEquals(x));

UPDATE3 Thanks to Eric: the original code in this post with SequenceEqual will work if the collections are sorted first. Because SequenceEqual takes the order of the elements into account, the collections need to be sorted before calling it. I guess this is not much of a problem, as sorting is pretty fast (O(n log n)).
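
A minimal sketch of the sort-then-compare idea, assuming ordinal string comparison is what is wanted. The names `SortedComparison` and `AllEqualIgnoringOrder` are hypothetical, made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedComparison
{
    // Hypothetical helper: true when every duplicate holds the same elements
    // as the original, with the same multiplicities, regardless of order.
    public static bool AllEqualIgnoringOrder(List<string> original, params List<string>[] duplicates)
    {
        // OrderBy produces a sorted copy, so the callers' lists are left untouched.
        var sortedOriginal = original.OrderBy(s => s, StringComparer.Ordinal).ToList();

        return duplicates.All(d =>
            d.Count == sortedOriginal.Count && // cheap check before sorting
            d.OrderBy(s => s, StringComparer.Ordinal)
             .SequenceEqual(sortedOriginal, StringComparer.Ordinal));
    }
}
```

Unlike the HashSet version, this treats { "x", "y", "y" } and { "x", "x", "y" } as different, because the duplicates survive the sort.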

UPDATE4 As per Brian's suggestion, I can use a lookup for this.

var originallkup = original.ToLookup(i => i);    
var lookuplist = new List<ILookup<string, string>>
                                    {   duplicate4.ToLookup(i=>  i), 
                                        duplicate3.ToLookup(i=>  i), 
                                        duplicate2.ToLookup(i=>  i),
                                        duplicate1.ToLookup(i=>  i)
                                    };

bool isequal = (lookuplist.Sum(x => x.Count) == (originallkup.Count * 4)) &&       
   (originallkup.All(x => lookuplist.All(i => i[x.Key].Count() == x.Count())));

Thank you all for your responses.

asked Feb 09 '12 12:02 by Jimmy


2 Answers

"I have a List. I duplicate the List many times and use it for different purposes. At some point I need to check if the contents of all these collections are same."

A commenter then asks:

"Is the order important? Or just the content?"

And you respond:

"only the content is important"

In that case you are using the wrong data structure in the first place. Use a HashSet<T>, not a List<T>, to represent an unordered collection of items that must be cheaply compared for set equality.

Once you have everything in hash sets instead of lists, you can simply use their SetEquals method to see if any pair of sets is unequal.

Alternatively: keep everything in lists, until the point where you want to compare for equality. Initialize a hash set from one of the lists, and then use SetEquals to compare that hash set to every other list.
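
The second option could be sketched roughly like this. `SetComparison` and `AllSetEqual` are hypothetical names used only for illustration, and the sketch assumes the default string equality comparer is acceptable:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetComparison
{
    // Hypothetical helper: true when every duplicate has the same distinct
    // elements as the original. Order and duplicate counts are ignored.
    public static bool AllSetEqual(List<string> original, params List<string>[] duplicates)
    {
        // Build the hash set once, from one of the lists.
        var reference = new HashSet<string>(original);

        // SetEquals accepts any IEnumerable<string>, so the lists can be passed directly.
        return duplicates.All(d => reference.SetEquals(d));
    }
}
```

Note that under set semantics { "a", "b" } and { "b", "a", "a" } compare as equal, which is only appropriate if duplicates really do not matter.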

answered Sep 29 '22 01:09 by Eric Lippert


I honestly can't think of a more efficient solution, but as for reducing the number of lines of code, give this a bash:

var allLists = new List<List<string>>() { original, duplicate1, duplicate2, duplicate3, duplicate4 };

bool allEqual = allLists.All(l => l.SequenceEqual(original));

Or, use the Any operator - might be better in terms of performance.

bool allEqual = !allLists.Any(l => !l.SequenceEqual(original));

EDIT: Confirmed, Any will stop enumerating the source once it determines a value. Thank you MSDN.
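
For what it's worth, `All` is also documented to stop as soon as the result can be determined, so the two forms should end up doing the same number of comparisons. A small sketch that counts predicate calls to check this; `ShortCircuitDemo` and `CountComparisons` are hypothetical names, not part of LINQ:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShortCircuitDemo
{
    // Returns how many lists each form compared before it produced an answer.
    public static (int allCalls, int anyCalls) CountComparisons(
        List<List<string>> allLists, List<string> original)
    {
        int allCalls = 0, anyCalls = 0;

        bool viaAll = allLists.All(l => { allCalls++; return l.SequenceEqual(original); });
        bool viaAny = !allLists.Any(l => { anyCalls++; return !l.SequenceEqual(original); });

        // Both forms express the same condition, so they must agree.
        if (viaAll != viaAny) throw new InvalidOperationException("The two forms should agree.");

        return (allCalls, anyCalls);
    }
}
```

With a mismatch in the second of three lists, both counters should come out as 2: each form stops at the first list that differs from the original.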

EDIT # 2: I have been looking into the performance of SequenceEqual. This guy has a nice post comparing SequenceEqual to a more imperative function. I modified his example to work with List<string> and my findings match his. It would appear that as far as performance is concerned, SequenceEqual isn't high on the list of preferred methods.
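
A "more imperative function" for two lists might look something like the sketch below. This is not the code from the linked post; `ImperativeComparison` and `ListsEqual` are hypothetical names, ordinal string comparison is an assumption, and no timings are claimed for it:

```csharp
using System;
using System.Collections.Generic;

public static class ImperativeComparison
{
    // Hypothetical helper: element-by-element comparison of two lists, in order.
    public static bool ListsEqual(List<string> a, List<string> b)
    {
        if (ReferenceEquals(a, b)) return true;   // same instance, or both null
        if (a == null || b == null) return false; // exactly one is null
        if (a.Count != b.Count) return false;     // different lengths can never match

        for (int i = 0; i < a.Count; i++)
        {
            if (!string.Equals(a[i], b[i], StringComparison.Ordinal)) return false;
        }
        return true;
    }
}
```

It drops into the earlier one-liner as `allLists.All(l => ImperativeComparison.ListsEqual(l, original))`. Whether it is actually faster is worth measuring on your own data.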

answered Sep 29 '22 02:09 by tobias86