
LINQ's Distinct() on a particular property

I am playing with LINQ to learn about it, but I can't figure out how to use Distinct when I do not have a simple list (a simple list of integers is easy to handle; that is not the question). What if I want to use Distinct on a list of objects, based on one or more properties of the object?

Example: Suppose the object is a Person with a property Id. How can I get all Person objects and apply Distinct to them using the Id property?

Person1: Id=1, Name="Test1"
Person2: Id=1, Name="Test1"
Person3: Id=2, Name="Test2"

How can I get just Person1 and Person3? Is that possible?

If it's not possible with LINQ, what would be the best way to get a distinct list of Person objects based on some of their properties in .NET 3.5?

Patrick Desjardins, asked Jan 28 '09 20:01



2 Answers

What if I want to obtain a distinct list based on one or more properties?

Simple! You want to group them and pick a winner out of the group.

List<Person> distinctPeople = allPeople
  .GroupBy(p => p.PersonId)
  .Select(g => g.First())
  .ToList();

If you want to define groups on multiple properties, here's how:

List<Person> distinctPeople = allPeople
  .GroupBy(p => new {p.PersonId, p.FavoriteColor} )
  .Select(g => g.First())
  .ToList();

Note: Certain query providers are unable to resolve that each group must have at least one element, and that First is the appropriate method to call in that situation. If you find yourself working with such a query provider, FirstOrDefault may help get your query through the query provider.

Note2: Consider this answer for an EF Core (prior to EF Core 6) compatible approach. https://stackoverflow.com/a/66529949/8155
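
For a self-contained illustration, here is a minimal sketch of the grouping approach applied to the sample data from the question. The Person class and the GroupByDemo, SamplePeople and DistinctById names are assumptions made up for this example (the question uses Id, while the snippets above use PersonId). With LINQ to Objects, GroupBy yields groups in the order their keys first appear and keeps source order within each group, so First() picks the earliest item per key; other query providers may not give that guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type, modeled on the example data in the question.
public sealed class Person
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class GroupByDemo
{
    // The three people from the question: two share Id 1, one has Id 2.
    public static List<Person> SamplePeople()
    {
        return new List<Person>
        {
            new Person { Id = 1, Name = "Test1" },
            new Person { Id = 1, Name = "Test1" },
            new Person { Id = 2, Name = "Test2" },
        };
    }

    // Groups by Id and keeps the first Person seen for each Id.
    public static List<Person> DistinctById(IEnumerable<Person> people)
    {
        return people
            .GroupBy(p => p.Id)
            .Select(g => g.First())
            .ToList();
    }
}
```

Calling GroupByDemo.DistinctById(GroupByDemo.SamplePeople()) should leave two entries, corresponding to Person1 and Person3 in the question.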

Amy B, answered Oct 11 '22 10:10


EDIT: This is now part of MoreLINQ.

What you need is a "distinct-by" effectively. I don't believe it's part of LINQ as it stands (.NET 6 later added Enumerable.DistinctBy to the framework itself), although it's fairly easy to write:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>
    (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        if (seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

So to find the distinct values using just the Id property, you could use:

var query = people.DistinctBy(p => p.Id); 

And to use multiple properties, you can use anonymous types, which implement equality appropriately:

var query = people.DistinctBy(p => new { p.Id, p.Name }); 

Untested, but it should work (and it now at least compiles).

It assumes the default comparer for the keys though - if you want to pass in an equality comparer, just pass it on to the HashSet constructor.
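
As a rough sketch of that last point, the overload below takes an IEqualityComparer<TKey> and hands it to the HashSet constructor. The method is named DistinctByKey here, a made-up name chosen only so the example cannot clash with MoreLINQ's DistinctBy or with Enumerable.DistinctBy on .NET 6 and later; the class name is likewise an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctByKeyExtensions
{
    // Hypothetical overload: same idea as DistinctBy above, but the caller
    // supplies the comparer used to decide whether two keys are equal.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer)
    {
        // The HashSet uses the supplied comparer for both hashing and equality.
        HashSet<TKey> seenKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            // Add returns false when an equal key was already seen.
            if (seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }
}
```

For example, new[] { "Test1", "TEST1", "Test2" }.DistinctByKey(n => n, StringComparer.OrdinalIgnoreCase) treats "Test1" and "TEST1" as the same key and keeps only the first of the two.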

Jon Skeet, answered Oct 11 '22 11:10