 

Why does Enumerable.Except return distinct items?

Tags: c#, .net, linq

I just spent over an hour debugging a bug in our code, which in the end turned out to be caused by a behaviour of the Enumerable.Except method that we didn't know about:

var ilist = new[] { 1, 1, 1, 1 };
var ilist2 = Enumerable.Empty<int>();
ilist.Except(ilist2); // returns { 1 } as opposed to { 1, 1, 1, 1 }

or more generally:

var ilist3 = new[] { 1 };
var ilist4 = new[] { 1, 1, 2, 2, 3 };
ilist4.Except(ilist3); // returns { 2, 3 } as opposed to { 2, 2, 3 }

Looking at the MSDN page:

This method returns those elements in first that do not appear in second. It does not also return those elements in second that do not appear in first.

I get it that in cases like this:

var ilist = new[] { 1, 1, 1, 1 };
var ilist2 = new[] { 1 };
ilist.Except(ilist2); // returns an empty sequence

you get an empty result because every element in the first array 'appears' in the second and should therefore be removed.

But why do we only get distinct instances of all other items that do not appear in the second array? What's the rationale behind this behaviour?

theburningmonk asked Dec 20 '10 18:12





1 Answer

I certainly cannot say for sure why they decided to do it that way. However, I'll give it a shot.

MSDN describes Except like this:

Produces the set difference of two sequences by using the default equality comparer to compare values.

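A set is described like this:

A set is a collection of distinct objects, considered as an object in its own right

So Except treats its inputs as sets rather than as lists, and the difference of two sets cannot contain duplicates. That would explain why each remaining value shows up only once.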
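To make the set semantics concrete, here is a minimal sketch in C#. It compares the built-in Except with a duplicate-preserving filter, using the arrays from the question. The helper name ExceptSketch is hypothetical and is not the framework's source. It only illustrates, under the assumption that Except tracks already-seen values in a hash set, why duplicates in the first sequence collapse.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var first = new[] { 1, 1, 2, 2, 3 };
var second = new[] { 1 };

// Built-in set difference: each remaining value is yielded once.
var setDifference = first.Except(second).ToArray();

// Duplicate-preserving alternative: filter against a lookup set instead.
var excluded = new HashSet<int>(second);
var keepDuplicates = first.Where(x => !excluded.Contains(x)).ToArray();

// Hypothetical sketch of the set-difference idea (not the actual source).
var sketched = ExceptSketch(first, second).ToArray();

Console.WriteLine(string.Join(", ", setDifference));  // 2, 3
Console.WriteLine(string.Join(", ", keepDuplicates)); // 2, 2, 3
Console.WriteLine(string.Join(", ", sketched));       // 2, 3

// Seeds a set with `second`, then yields an item from `first` only the
// first time it can be added to the set. Items already in `second`, and
// repeats within `first`, are both skipped.
static IEnumerable<T> ExceptSketch<T>(IEnumerable<T> first, IEnumerable<T> second)
{
    var seen = new HashSet<T>(second);
    foreach (var item in first)
    {
        if (seen.Add(item))
        {
            yield return item;
        }
    }
}
```

If the duplicates matter, the Where/HashSet form is one way to keep them. It still uses the default equality comparer for the element type, as Except does.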

Mike M. answered Oct 12 '22 10:10
