Why is there no Linq method to return distinct values by a predicate?

Tags: c#, linq, distinct

I want to get the distinct values in a list, but not by the standard equality comparison.

What I want to do is something like this:

return myList.Distinct( (x, y) => x.Url == y.Url ); 

I can't: there's no extension method in Linq that will do this, only one that takes an IEqualityComparer.
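For reference, here is roughly what the existing IEqualityComparer route looks like. This is a minimal sketch: the Page type, its Title property, and the PageUrlComparer and ComparerExample names are made up for illustration; only the Url property comes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; only Url appears in the question.
public sealed class Page
{
    public string Url { get; set; }
    public string Title { get; set; }
}

// Treats two pages as equal when their Url strings match (ordinal comparison).
public sealed class PageUrlComparer : IEqualityComparer<Page>
{
    public bool Equals(Page x, Page y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.Url, y.Url, StringComparison.Ordinal);
    }

    // Must agree with Equals: pages with equal Urls get equal hash codes.
    public int GetHashCode(Page obj)
    {
        return obj?.Url == null ? 0 : StringComparer.Ordinal.GetHashCode(obj.Url);
    }
}

public static class ComparerExample
{
    // Uses the built-in Distinct overload that accepts an IEqualityComparer<T>.
    public static List<Page> DistinctByUrl(IEnumerable<Page> pages)
    {
        return pages.Distinct(new PageUrlComparer()).ToList();
    }
}
```

It works, but it takes a whole class to express what the lambda in the wished-for call says in one line.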

I can hack around it with this:

return myList.GroupBy( x => x.Url ).Select( g => g.First() ); 

But that seems messy. It also doesn't quite do the same thing - I can only use it here because I have a single key.

I could also add my own:

public static IEnumerable<T> Distinct<T>(
    this IEnumerable<T> input, Func<T, T, bool> compare )
{
    //write my own here
}

But that does seem rather like writing something that should be there in the first place.
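One possible way to fill in that stub is sketched below. The method name DistinctWhere and the class name are hypothetical, chosen only to keep the example apart from the built-in Distinct. Note the cost: a bare equality predicate supplies no hash code, so this sketch has to compare each candidate against every element kept so far, which is quadratic in the worst case.

```csharp
using System;
using System.Collections.Generic;

public static class PredicateDistinctExtensions
{
    // Hypothetical name; keeps the first element of each run of "equal" items.
    public static IEnumerable<T> DistinctWhere<T>(
        this IEnumerable<T> input, Func<T, T, bool> compare)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (compare == null) throw new ArgumentNullException(nameof(compare));
        return Iterator(input, compare);
    }

    // Separate iterator so the argument checks above run eagerly.
    private static IEnumerable<T> Iterator<T>(
        IEnumerable<T> input, Func<T, T, bool> compare)
    {
        var kept = new List<T>();
        foreach (T candidate in input)
        {
            bool seen = false;
            // No hash code is available, so scan everything kept so far.
            foreach (T existing in kept)
            {
                if (compare(existing, candidate))
                {
                    seen = true;
                    break;
                }
            }
            if (!seen)
            {
                kept.Add(candidate);
                yield return candidate;
            }
        }
    }
}
```

The result is only well defined if the predicate behaves like a real equivalence (symmetric and transitive); the sketch does not check that.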

Anyone know why this method isn't there?

Am I missing something?

Keith asked Feb 06 '09 11:02


2 Answers

It's annoying, certainly. It's also part of my "MoreLINQ" project which I must pay some attention to at some point :) There are plenty of other operations which make sense when acting on a projection, but returning the original - MaxBy and MinBy spring to mind.

As you say, it's easy to write - although I prefer the name "DistinctBy" to match OrderBy etc. Here's my implementation if you're interested:

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>
        (this IEnumerable<TSource> source,
         Func<TSource, TKey> keySelector)
    {
        return source.DistinctBy(keySelector,
                                 EqualityComparer<TKey>.Default);
    }

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>
        (this IEnumerable<TSource> source,
         Func<TSource, TKey> keySelector,
         IEqualityComparer<TKey> comparer)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }
        if (keySelector == null)
        {
            throw new ArgumentNullException("keySelector");
        }
        if (comparer == null)
        {
            throw new ArgumentNullException("comparer");
        }
        return DistinctByImpl(source, keySelector, comparer);
    }

    private static IEnumerable<TSource> DistinctByImpl<TSource, TKey>
        (IEnumerable<TSource> source,
         Func<TSource, TKey> keySelector,
         IEqualityComparer<TKey> comparer)
    {
        HashSet<TKey> knownKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            if (knownKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }
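A usage sketch may help show what the key comparer overload buys you. The block below restates the implementation above in condensed form (comparer made optional) so that it compiles on its own; the Link type, its Visits property, and the DistinctBySketch class name are assumptions for illustration. The call goes through the class name rather than extension syntax because .NET 6 and later ship a built-in Enumerable.DistinctBy, and an extension-style call could be ambiguous when System.Linq is in scope.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type for the demo.
public sealed class Link
{
    public string Url { get; set; }
    public int Visits { get; set; }
}

public static class DistinctBySketch
{
    // Condensed restatement of the DistinctBy above; a null comparer means the default.
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
        return Impl(source, keySelector, comparer ?? EqualityComparer<TKey>.Default);
    }

    private static IEnumerable<TSource> Impl<TSource, TKey>(
        IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer)
    {
        var knownKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            // Add returns false for a key already seen, so the first element wins.
            if (knownKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }

    public static List<Link> Demo()
    {
        var links = new List<Link>
        {
            new Link { Url = "http://a", Visits = 1 },
            new Link { Url = "HTTP://A", Visits = 2 },
            new Link { Url = "http://b", Visits = 3 }
        };

        // Case-insensitive key comparison: the second link counts as a duplicate.
        return new List<Link>(
            DistinctBySketch.DistinctBy(links, l => l.Url, StringComparer.OrdinalIgnoreCase));
    }
}
```

With this implementation the first element seen for each key is the one returned, and nothing is enumerated until the result is iterated.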
Jon Skeet answered Oct 20 '22 00:10


"But that seems messy."

It's not messy, it's correct.

  • If you want Distinct Programmers by FirstName and there are four Amys, which one do you want?
  • If you Group programmers By FirstName and take the First one, then it is clear what you want to do in the case of four Amys.
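To make the "which Amy?" point concrete, here is a small sketch in which the choice within each group is spelled out. The Programmer type, its YearsExperience property, and the rule of keeping the most experienced person are all assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; YearsExperience exists only to give the groups a tie-breaker.
public sealed class Programmer
{
    public string FirstName { get; set; }
    public int YearsExperience { get; set; }
}

public static class RepresentativeExample
{
    // One programmer per first name: the most experienced in each group.
    public static List<Programmer> MostExperiencedPerFirstName(
        IEnumerable<Programmer> programmers)
    {
        return programmers
            .GroupBy(p => p.FirstName)
            .Select(g => g.OrderByDescending(p => p.YearsExperience).First())
            .ToList();
    }
}
```

Swapping the OrderByDescending for a plain First() gives the "first one seen" behaviour; either way the query states the rule instead of leaving it implicit.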

"I can only use it here because I have a single key."

You can do a multiple key "distinct" with the same pattern:

return myList
    .GroupBy( x => new { x.Url, x.Age } )
    .Select( g => g.First() );
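The multi-key version works because C# anonymous types compare by the values of their properties, so two keys with the same Url and Age land in the same group. A self-contained sketch, assuming a hypothetical Visit type with a Label property added only to tell the rows apart:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; Url and Age mirror the snippet above, Label is for the demo only.
public sealed class Visit
{
    public string Url { get; set; }
    public int Age { get; set; }
    public string Label { get; set; }
}

public static class MultiKeyExample
{
    // Keeps the first visit seen for each distinct (Url, Age) pair.
    public static List<Visit> DistinctByUrlAndAge(IEnumerable<Visit> visits)
    {
        return visits
            .GroupBy(v => new { v.Url, v.Age })   // anonymous types use value equality
            .Select(g => g.First())
            .ToList();
    }
}
```

For LINQ to Objects, groups come back in the order their keys first appear, and elements within a group keep their source order, so First() is the earliest match.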
Amy B answered Oct 20 '22 00:10