Remove duplicates based on a condition using Linq

Tags: c#, linq, generics

My object is in this form

List<SignUp>

class SignUp
{
  public int Id { get; set; }
  public int VersionId { get; set; }
  public int PersonId { get; set; }
  public DateTime? SignUpDate { get; set; }
}

People sign up to a version of a document. Some versions never get archived, and people have to re-sign every year, so I end up with records like:

SignUp s = new SignUp { Id = 1, VersionId = 1, PersonId = 5 };
SignUp s2 = new SignUp { Id = 2, VersionId = 2, PersonId = 5 };
SignUp s3 = new SignUp { Id = 3, VersionId = 1, PersonId = 5 };

Now this list, which contains s, s2 and s3, has two duplicates on the PersonId, VersionId combination: s and s3. The only difference is that s3 has a higher Id than s. Hence I want to eliminate s and display just s2 and s3 (s is the older sign-up, so I ignore it).

How can this be achieved with a LINQ query, if possible?

chugh97 asked Oct 28 '25 16:10


1 Answer

How about:

List<SignUp> signups = ...

var filteredSignups = from signup in signups
                      group signup by new { signup.PersonId, signup.VersionId }
                                      into pvIdGroup
                      select pvIdGroup.OrderBy(groupedSignUp => groupedSignUp.Id)
                                      .Last();

The idea is to group the items by the two properties and then pick the "best" item from each group.
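As a minimal, self-contained sketch of that idea, here is the same query run against the three sample records from the question. The SignUpFilter class, the LatestPerPersonVersion method and the Program wrapper are names made up for this illustration; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SignUp
{
    public int Id { get; set; }
    public int VersionId { get; set; }
    public int PersonId { get; set; }
    public DateTime? SignUpDate { get; set; }
}

// Hypothetical helper class, introduced only for this example.
public static class SignUpFilter
{
    // Keeps only the highest-Id record for each (PersonId, VersionId) pair.
    public static List<SignUp> LatestPerPersonVersion(IEnumerable<SignUp> signups)
    {
        return (from signup in signups
                group signup by new { signup.PersonId, signup.VersionId } into pvIdGroup
                // Sort each group by Id and take the last (largest) one.
                select pvIdGroup.OrderBy(s => s.Id).Last())
               .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var signups = new List<SignUp>
        {
            new SignUp { Id = 1, VersionId = 1, PersonId = 5 },
            new SignUp { Id = 2, VersionId = 2, PersonId = 5 },
            new SignUp { Id = 3, VersionId = 1, PersonId = 5 },
        };

        foreach (var s in SignUpFilter.LatestPerPersonVersion(signups))
        {
            Console.WriteLine($"Id={s.Id}, VersionId={s.VersionId}, PersonId={s.PersonId}");
        }
    }
}
```

With LINQ to Objects, groups are yielded in the order their keys first appear in the source, so the record with Id 3 is listed before the one with Id 2. If the final order matters, an OrderBy on Id can be appended to the result.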

If you don't want the inefficiency of sorting the items within each group, consider using an O(n) MaxBy method, such as the one from morelinq.

Then the select becomes:

select pvIdGroup.MaxBy(groupedSignUp => groupedSignUp.Id)
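If adding a library is not an option, a single-pass maximum can also be sketched with the standard Aggregate operator. This is an alternative to MaxBy, not the morelinq implementation, and the LinearSignUpFilter and LinearDemo names are hypothetical. Note also that .NET 6 and later ship a built-in Enumerable.MaxBy that returns a single element, whereas newer morelinq releases, as far as I know, return all maximal elements from MaxBy, so it is worth checking which one your project resolves to.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SignUp
{
    public int Id { get; set; }
    public int VersionId { get; set; }
    public int PersonId { get; set; }
    public DateTime? SignUpDate { get; set; }
}

// Hypothetical helper class, introduced only for this example.
public static class LinearSignUpFilter
{
    // Picks the highest-Id record per (PersonId, VersionId) pair
    // with one pass over each group instead of a sort.
    public static List<SignUp> LatestPerPersonVersion(IEnumerable<SignUp> signups)
    {
        return signups
            .GroupBy(s => new { s.PersonId, s.VersionId })
            // Groups are never empty, so the seedless Aggregate overload is safe here.
            .Select(g => g.Aggregate((best, next) => next.Id > best.Id ? next : best))
            .ToList();
    }
}

public static class LinearDemo
{
    public static void Main()
    {
        var signups = new List<SignUp>
        {
            new SignUp { Id = 1, VersionId = 1, PersonId = 5 },
            new SignUp { Id = 2, VersionId = 2, PersonId = 5 },
            new SignUp { Id = 3, VersionId = 1, PersonId = 5 },
        };

        foreach (var s in LinearSignUpFilter.LatestPerPersonVersion(signups))
        {
            Console.WriteLine($"Id={s.Id}, VersionId={s.VersionId}, PersonId={s.PersonId}");
        }
    }
}
```

For small groups like the ones in the question, the difference from OrderBy(...).Last() is unlikely to be measurable; the single pass only starts to matter when many records share the same PersonId and VersionId.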
Ani answered Oct 31 '25 07:10


