Using LINQ to find / delete duplicates

Question

I have a table that contains a bunch of duplicates. These are exact duplicates, minus the primary key column, which is an integer identity column.

Using EF and LINQ, how do I find the duplicates and delete them, leaving only one copy.

I found the duplicates and a count of each using SQL and SSMS. I'm just don't know where to start with LINQ.

Thanks!

Craig Stuntz · Accepted Answer

Off the top of my head (untested):

var q = from r in Context.Table
        group r by new { FieldA = r.FieldA, FieldB = r.FieldB, // ...
            into g
        where g.Count() > 1
        select g;
foreach (var g in q)
{
    var dupes = g.Skip(1).ToList();
    foreach (var record in dupes)
    {
        Context.DeleteObject(record);
    }
}
Context.SaveChanges();

Jorge Rodrigues dos Santos · Answer

Building on @Craig Stuntz's answer with a "one liner" alternative:

var duplicates = db.Table.GroupBy(a => new { a.FieldA, a.FieldB, ...})
                         .Where(a => a.Count() > 1)
                         .SelectMany(a => a.ToList());

foreach (var d in duplicates)
{
     db.DeleteObject(d);
}

db.SaveChanges();

Using LINQ to find / delete duplicates

Tags:

linq

entity-framework

DenaliHardtail

2 Answers

Craig Stuntz

Jorge Rodrigues dos Santos

Recent Activity

Donate For Us

Using LINQ to find / delete duplicates

Tags:

linq

entity-framework

DenaliHardtail

2 Answers

Craig Stuntz

Jorge Rodrigues dos Santos

Related questions

Recent Activity

Donate For Us