
Except has similar effect to Distinct?


I just discovered that Except() will remove all elements in the second list from the first, but it also has the effect that it makes all elements in the returned result distinct.

The simple workaround I am using is Where(v => !secondList.Contains(v))

Can anyone explain to me why this is the behavior, and if possible point me to the documentation that explains this?

Stephan asked Jun 04 '10 16:06



2 Answers

The documentation for the Except function states:

Produces the set difference of two sequences by using the default equality comparer to compare values.

The set difference of two sets is defined as the members of the first set that do not appear in the second set.

The important word here is set, which is defined as:

...an abstract data structure that can store certain values, without any particular order, and no repeated values...

Because Except is documented as a set-based operation, it also has the effect of making the resulting values distinct.
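
As a small illustration of the point above, the following sketch compares Except with a plain filter on the same input. The class name ExceptDemo and the helper Show are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Hypothetical helper: joins a sequence into a string for display.
    public static string Show(IEnumerable<int> values) => string.Join(", ", values);

    public static void Main()
    {
        var first = new[] { 1, 1, 2, 2, 3, 3 };
        var second = new[] { 2 };

        // Set difference: 2 is removed and the remaining duplicates collapse.
        Console.WriteLine(Show(first.Except(second)));                  // 1, 3

        // Plain filtering keeps the duplicates that survive.
        Console.WriteLine(Show(first.Where(v => !second.Contains(v)))); // 1, 1, 3, 3
    }
}
```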

Greg Beech answered Sep 23 '22 15:09



You wrote:

Simple way around I am using is Where(v => !secondList.Contains(v))

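When you do this, the effect of Distinct is still applied to secondList: it is treated as a set, so a single occurrence of a value in secondList removes every matching element from the first list.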

For example:

var firstStrings = new [] { "1", null, null, null, "3", "3" };
var secondStrings = new [] { "1", "1", "1", null, null, "4" };
var resultStrings = firstStrings.Where(v => !secondStrings.Contains(v)); // 3, 3

I created an extension method that applies no Distinct at all. Example of usage:

var result2Strings = firstStrings.ExceptAll(secondStrings).ToList(); // null, 3, 3 

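This is what it does: each element of the second sequence removes at most one equal element from the first, so any surplus duplicates in the first sequence are kept.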

This is the source:

public static IEnumerable<TSource> ExceptAll<TSource>(
    this IEnumerable<TSource> first,
    IEnumerable<TSource> second)
{
    // Do not reuse the overload below, because that is a slower implementation.
    if (first == null) { throw new ArgumentNullException("first"); }
    if (second == null) { throw new ArgumentNullException("second"); }

    // Each successful Remove consumes one matching element of second.
    // The filter mutates secondList, so enumerate the result only once.
    var secondList = second.ToList();
    return first.Where(s => !secondList.Remove(s));
}

public static IEnumerable<TSource> ExceptAll<TSource>(
    this IEnumerable<TSource> first,
    IEnumerable<TSource> second,
    IEqualityComparer<TSource> comparer)
{
    if (first == null) { throw new ArgumentNullException("first"); }
    if (second == null) { throw new ArgumentNullException("second"); }
    var comparerUsed = comparer ?? EqualityComparer<TSource>.Default;

    var secondList = second.ToList();
    foreach (var item in first)
    {
        // Locate the match with the supplied comparer; List<T>.Remove would
        // use the default comparer instead and could miss the element.
        var index = secondList.FindIndex(s => comparerUsed.Equals(s, item));
        if (index >= 0)
        {
            secondList.RemoveAt(index);
        }
        else
        {
            yield return item;
        }
    }
}

Edit: A faster implementation, based on the comment of DigEmAll

public static IEnumerable<TSource> ExceptAll<TSource>(
    this IEnumerable<TSource> first,
    IEnumerable<TSource> second)
{
    return ExceptAll(first, second, null);
}

public static IEnumerable<TSource> ExceptAll<TSource>(
    this IEnumerable<TSource> first,
    IEnumerable<TSource> second,
    IEqualityComparer<TSource> comparer)
{
    if (first == null) { throw new ArgumentNullException("first"); }
    if (second == null) { throw new ArgumentNullException("second"); }

    var secondCounts = new Dictionary<TSource, int>(comparer ?? EqualityComparer<TSource>.Default);
    int count;
    int nullCount = 0;

    // Count the values from second
    foreach (var item in second)
    {
        if (item == null)
        {
            nullCount++;
        }
        else
        {
            if (secondCounts.TryGetValue(item, out count))
            {
                secondCounts[item] = count + 1;
            }
            else
            {
                secondCounts.Add(item, 1);
            }
        }
    }

    // Yield the values from first
    foreach (var item in first)
    {
        if (item == null)
        {
            nullCount--;
            if (nullCount < 0)
            {
                yield return item;
            }
        }
        else
        {
            if (secondCounts.TryGetValue(item, out count))
            {
                if (count == 0)
                {
                    secondCounts.Remove(item);
                    yield return item;
                }
                else
                {
                    secondCounts[item] = count - 1;
                }
            }
            else
            {
                yield return item;
            }
        }
    }
}
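
To illustrate how the counting approach interacts with a custom comparer, here is a condensed, self-contained sketch. It is not the code above: ExceptAllDemo and ExceptAllSketch are hypothetical names, the method is a plain static method rather than an extension, and it assumes the sequences contain no null elements, so the separate null counter is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptAllDemo
{
    // Condensed sketch of the counting idea; assumes no null elements.
    public static IEnumerable<T> ExceptAllSketch<T>(
        IEnumerable<T> first, IEnumerable<T> second, IEqualityComparer<T> comparer)
    {
        var counts = new Dictionary<T, int>(comparer ?? EqualityComparer<T>.Default);

        // Count how often each value occurs in second.
        foreach (var item in second)
        {
            counts.TryGetValue(item, out int n); // n is 0 when the key is missing
            counts[item] = n + 1;
        }

        foreach (var item in first)
        {
            if (counts.TryGetValue(item, out int n) && n > 0)
                counts[item] = n - 1;   // cancelled by one occurrence in second
            else
                yield return item;      // surplus occurrence is kept
        }
    }

    public static void Main()
    {
        var first = new[] { "a", "A", "a", "b" };
        var second = new[] { "A", "c" };

        // Case-insensitive: the single "A" in second cancels only the first "a".
        var result = ExceptAllSketch(first, second, StringComparer.OrdinalIgnoreCase);
        Console.WriteLine(string.Join(", ", result)); // A, a, b
    }
}
```

With the built-in Except and the same comparer, the result would instead collapse to the single element b.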

More info on my blog (also variant for Intersect and Union)

Alex Siepman answered Sep 25 '22 15:09
