I have the following code:
var foo = (from data in pivotedData.AsEnumerable()
           select new
           {
               Group = data.Field<string>("Group_Number"),
               Study = data.Field<string>("Study_Name")
           }).Distinct();
As expected this returns distinct values. However, what I want is to return a strongly-typed collection as opposed to an anonymous type, so when I do:
var foo = (from data in pivotedData.AsEnumerable()
           select new BarObject
           {
               Group = data.Field<string>("Group_Number"),
               Study = data.Field<string>("Study_Name")
           }).Distinct();
This does not return distinct values; it returns them all. Is there a way to do this with actual objects?
For Distinct() (and many other LINQ features) to work, the class being compared (BarObject in your example) must override Equals() and GetHashCode(), or you must pass a separate IEqualityComparer<T> as an argument to Distinct().
Many LINQ methods take advantage of GetHashCode() for performance because internally they use structures like a Set<T> to hold the unique items, which relies on hashing for O(1) lookups. Also, GetHashCode() can quickly tell you which objects may be equivalent and which are definitely not, as long as GetHashCode() is properly implemented, of course.
So you should make every class you intend to compare in LINQ override Equals() and GetHashCode() for completeness, or create a separate IEqualityComparer<T> implementation.
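As a minimal sketch of the first option, BarObject could override Equals() and GetHashCode() along these lines. The Group and Study properties come from the question; the in-memory sample data and the DistinctDemo helper are hypothetical stand-ins for the rows of pivotedData.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// BarObject with value-based equality on Group and Study.
public sealed class BarObject : IEquatable<BarObject>
{
    public string Group { get; set; }
    public string Study { get; set; }

    public bool Equals(BarObject other) =>
        other != null && Group == other.Group && Study == other.Study;

    public override bool Equals(object obj) => Equals(obj as BarObject);

    // Objects that are equal must return the same hash code.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (Group?.GetHashCode() ?? 0);
            hash = hash * 31 + (Study?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

public static class DistinctDemo
{
    // Hypothetical sample data; the real query would project from pivotedData.
    public static List<BarObject> DistinctBars()
    {
        var bars = new[]
        {
            new BarObject { Group = "1", Study = "A" },
            new BarObject { Group = "1", Study = "A" },
            new BarObject { Group = "2", Study = "A" },
        };
        // Distinct() now uses the overrides above instead of reference equality.
        return bars.Distinct().ToList();
    }
}
```

One design caveat: because the properties here are settable, changing them after an object has been placed in a hash-based collection would change its hash code, so it is usually safer to treat such objects as immutable once they are created.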
Either do as dlev suggested or use:
var foo = (from data in pivotedData.AsEnumerable()
           select new BarObject
           {
               Group = data.Field<string>("Group_Number"),
               Study = data.Field<string>("Study_Name")
           }).GroupBy(x => x.Group).Select(x => x.FirstOrDefault());
Check this out for more info: http://blog.jordanterrell.com/post/LINQ-Distinct()-does-not-work-as-expected.aspx
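Note that grouping by x.Group alone keeps one item per group number, which is not quite the same as distinct Group/Study pairs. If both fields should count, one possible variation is to group by an anonymous type containing both, since anonymous types compare by value. The sketch below uses a plain BarObject and a hypothetical helper name (DistinctByGroupAndStudy) rather than the original DataTable.

```csharp
using System.Collections.Generic;
using System.Linq;

public class BarObject
{
    public string Group { get; set; }
    public string Study { get; set; }
}

public static class GroupByDemo
{
    // Hypothetical helper: keeps the first BarObject for each (Group, Study) pair.
    public static List<BarObject> DistinctByGroupAndStudy(IEnumerable<BarObject> bars)
    {
        return bars
            .GroupBy(x => new { x.Group, x.Study }) // anonymous key compares by value
            .Select(g => g.First())                 // every group has at least one item
            .ToList();
    }
}
```

This approach needs no changes to BarObject itself, at the cost of building the groups in memory.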
You need to override Equals and GetHashCode for BarObject because EqualityComparer<BarObject>.Default falls back to reference equality unless you have provided overrides of Equals and GetHashCode (this default comparer is what Enumerable.Distinct<BarObject>(this IEnumerable<BarObject> source) uses). Alternatively, you can pass an IEqualityComparer<BarObject> to Enumerable.Distinct<BarObject>(this IEnumerable<BarObject>, IEqualityComparer<BarObject>).
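For the comparer route, a sketch might look like the following. BarObjectComparer is a hypothetical name, and it assumes that two BarObject instances count as equal when both Group and Study match.

```csharp
using System.Collections.Generic;
using System.Linq;

public class BarObject
{
    public string Group { get; set; }
    public string Study { get; set; }
}

// Hypothetical comparer: equality is defined by Group and Study together.
public sealed class BarObjectComparer : IEqualityComparer<BarObject>
{
    public bool Equals(BarObject x, BarObject y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Group == y.Group && x.Study == y.Study;
    }

    public int GetHashCode(BarObject obj)
    {
        if (obj == null) return 0;
        unchecked
        {
            // Combine the hash codes of the same fields that Equals compares.
            return ((obj.Group?.GetHashCode() ?? 0) * 397)
                   ^ (obj.Study?.GetHashCode() ?? 0);
        }
    }
}

public static class ComparerDemo
{
    // Passes the comparer to the two-argument Distinct overload.
    public static List<BarObject> DistinctBars(IEnumerable<BarObject> bars)
    {
        return bars.Distinct(new BarObjectComparer()).ToList();
    }
}
```

In the original query this would amount to replacing .Distinct() with .Distinct(new BarObjectComparer()), leaving BarObject itself untouched.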
It looks like Distinct cannot compare your BarObject objects by value. Therefore it compares their references, which of course are all different from each other, even if the objects have the same contents.
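The difference between the two queries in the question can be illustrated with a small sketch. It assumes a BarObject with no equality overrides, as in the question; the ReferenceEqualityDemo class and its method names are hypothetical.

```csharp
public class BarObject
{
    public string Group { get; set; }
    public string Study { get; set; }
}

public static class ReferenceEqualityDemo
{
    // Two separate instances with identical contents: the default
    // Equals compares references, so this returns false.
    public static bool ClassInstancesEqual()
    {
        var a = new BarObject { Group = "1", Study = "A" };
        var b = new BarObject { Group = "1", Study = "A" };
        return a.Equals(b);
    }

    // Anonymous types get compiler-generated value equality,
    // so this returns true; that is why the first query worked.
    public static bool AnonymousInstancesEqual()
    {
        var a = new { Group = "1", Study = "A" };
        var b = new { Group = "1", Study = "A" };
        return a.Equals(b);
    }
}
```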
So either you override the Equals method, or you supply a custom IEqualityComparer<BarObject> to Distinct. Remember to override GetHashCode when you implement Equals; otherwise you will get strange results if you use your objects as keys in a dictionary or hash table, or put them into a HashSet<BarObject>. It might be (I don't know exactly) that Distinct internally uses a hash set.
Here is a collection of good practices for GetHashCode.
You want to use the other overload for Distinct() that takes a comparer. You can then implement your own IEqualityComparer<BarObject>.
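Try this, applying Distinct() to the rows before projecting. Note that DataRow uses reference equality by default, so pass DataRowComparer.Default to compare rows by value (this compares every column of the row, not only the two selected here):
var foo = (from data in pivotedData.AsEnumerable().Distinct(DataRowComparer.Default)
           select new BarObject
           {
               Group = data.Field<string>("Group_Number"),
               Study = data.Field<string>("Study_Name")
           });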