I just discovered that Except() removes all elements of the second list from the first, but it also makes all elements in the returned result distinct.
The simple workaround I am using is Where(v => !secondList.Contains(v)).
Can anyone explain to me why this is the behavior, and if possible point me to the documentation that explains this?
The documentation for the Except function states:
Produces the set difference of two sequences by using the default equality comparer to compare values.
The set difference of two sets is defined as the members of the first set that do not appear in the second set.
The important word here is set, which is defined as:
...an abstract data structure that can store certain values, without any particular order, and no repeated values...
Because Except is documented as a set-based operation, it also has the effect of making the resulting values distinct.
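To make the difference concrete, here is a small sketch that runs both approaches on the same input. The class name ExceptDemo, the helper names ViaExcept and ViaWhere, and the sample values are made up for illustration; only Except, Where and Contains are real LINQ methods.

```csharp
using System;
using System.Linq;

public static class ExceptDemo
{
    // Set difference: duplicates in the result collapse to one element each.
    public static int[] ViaExcept(int[] first, int[] second)
    {
        return first.Except(second).ToArray();
    }

    // Element-by-element filter: duplicates in first are preserved.
    public static int[] ViaWhere(int[] first, int[] second)
    {
        return first.Where(v => !second.Contains(v)).ToArray();
    }

    public static void Main()
    {
        var first = new[] { 1, 1, 2, 2, 3 };
        var second = new[] { 3 };

        Console.WriteLine(string.Join(", ", ViaExcept(first, second))); // 1, 2
        Console.WriteLine(string.Join(", ", ViaWhere(first, second)));  // 1, 1, 2, 2
    }
}
```

Both calls drop the 3, but only Except also collapses the repeated 1s and 2s.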
You wrote:
The simple workaround I am using is Where(v => !secondList.Contains(v))
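When you do this, a Distinct is still effectively applied to secondList: a single occurrence of a value in secondList removes every equal element from the first list, no matter how often it appears there.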
For example:
var firstStrings = new [] { "1", null, null, null, "3", "3" };
var secondStrings = new [] { "1", "1", "1", null, null, "4" };
var resultStrings = firstStrings.Where(v => !secondStrings.Contains(v)); // 3, 3
I created an extension method that applies no distinct at all. Example of usage:
var result2Strings = firstStrings.ExceptAll(secondStrings).ToList(); // null, 3, 3
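This is what it does: each element of the second sequence cancels at most one equal element of the first, so any surplus duplicates in the first sequence are kept.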
This is the source:
public static IEnumerable<TSource> ExceptAll<TSource>(
    this IEnumerable<TSource> first,
    IEnumerable<TSource> second)
{
    // Does not reuse the comparer overload because that implementation is slower.
    if (first == null) { throw new ArgumentNullException("first"); }
    if (second == null) { throw new ArgumentNullException("second"); }

    // Working copy of second: each successful Remove cancels one element of first.
    var secondList = second.ToList();
    return first.Where(s => !secondList.Remove(s));
}

public static IEnumerable<TSource> ExceptAll<TSource>(
    this IEnumerable<TSource> first,
    IEnumerable<TSource> second,
    IEqualityComparer<TSource> comparer)
{
    if (first == null) { throw new ArgumentNullException("first"); }
    if (second == null) { throw new ArgumentNullException("second"); }

    var comparerUsed = comparer ?? EqualityComparer<TSource>.Default;
    var secondList = second.ToList();
    foreach (var item in first)
    {
        // Locate the match with the supplied comparer and remove it by index;
        // List<T>.Remove would compare with the default comparer instead.
        int index = secondList.FindIndex(x => comparerUsed.Equals(x, item));
        if (index >= 0)
        {
            secondList.RemoveAt(index);
        }
        else
        {
            yield return item;
        }
    }
}
Edit: A faster implementation, based on the comment of DigEmAll:
public static IEnumerable<TSource> ExceptAll<TSource>(
    this IEnumerable<TSource> first,
    IEnumerable<TSource> second)
{
    return ExceptAll(first, second, null);
}

public static IEnumerable<TSource> ExceptAll<TSource>(
    this IEnumerable<TSource> first,
    IEnumerable<TSource> second,
    IEqualityComparer<TSource> comparer)
{
    if (first == null) { throw new ArgumentNullException("first"); }
    if (second == null) { throw new ArgumentNullException("second"); }

    var secondCounts = new Dictionary<TSource, int>(comparer ?? EqualityComparer<TSource>.Default);
    int count;
    int nullCount = 0;

    // Count the values from second (null cannot be a dictionary key, so it is counted separately)
    foreach (var item in second)
    {
        if (item == null)
        {
            nullCount++;
        }
        else
        {
            if (secondCounts.TryGetValue(item, out count))
            {
                secondCounts[item] = count + 1;
            }
            else
            {
                secondCounts.Add(item, 1);
            }
        }
    }

    // Yield the values from first
    foreach (var item in first)
    {
        if (item == null)
        {
            nullCount--;
            if (nullCount < 0)
            {
                yield return item;
            }
        }
        else
        {
            if (secondCounts.TryGetValue(item, out count))
            {
                if (count == 0)
                {
                    secondCounts.Remove(item);
                    yield return item;
                }
                else
                {
                    secondCounts[item] = count - 1;
                }
            }
            else
            {
                yield return item;
            }
        }
    }
}
More info on my blog (also variant for Intersect and Union)