
Using LINQ to find duplicates across multiple properties

Given a class with the following definition:

public class MyTestClass
{
    public int ValueA { get; set; }
    public int ValueB { get; set; }
}

How can duplicate values be found in a MyTestClass[] array?

For example,

MyTestClass[] items = new MyTestClass[3];
items[0] = new MyTestClass { ValueA = 1, ValueB = 1 };
items[1] = new MyTestClass { ValueA = 0, ValueB = 1 };
items[2] = new MyTestClass { ValueA = 1, ValueB = 1 };

This array contains a duplicate, because there are two MyTestClass objects in which ValueA and ValueB are both 1.

CatBusStop asked Apr 08 '11 14:04



3 Answers

You can find the duplicates by grouping the elements by ValueA and ValueB and then counting each group: any group with more than one element is a duplicate.

This is how you would isolate the duplicated value pairs:

var duplicates = items.GroupBy(i => new {i.ValueA, i.ValueB})
  .Where(g => g.Count() > 1)
  .Select(g => g.Key);
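
Note that the query above yields the duplicated keys (anonymous objects holding ValueA and ValueB), not the MyTestClass instances. If the instances themselves are needed, a minimal sketch along the same lines could flatten the groups with SelectMany instead. The DuplicateFinder class and FindDuplicateItems method names are made up for this illustration.

```csharp
using System.Linq;

public class MyTestClass
{
    public int ValueA { get; set; }
    public int ValueB { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class DuplicateFinder
{
    // Returns every item whose (ValueA, ValueB) pair occurs more than once,
    // including the first occurrence of each pair.
    public static MyTestClass[] FindDuplicateItems(MyTestClass[] items)
    {
        return items
            .GroupBy(i => new { i.ValueA, i.ValueB }) // group by both properties
            .Where(g => g.Count() > 1)                // keep groups with repeats
            .SelectMany(g => g)                       // flatten back to the items
            .ToArray();
    }
}
```

For the three-element array in the question, this should return items[0] and items[2].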
Hugo Migneron answered Oct 09 '22 10:10


You could use Jon Skeet's DistinctBy together with Except to find duplicates. See this response for his explanation of DistinctBy.

MyTestClass[] items = new MyTestClass[3];
items[0] = new MyTestClass { ValueA = 1, ValueB = 1 };
items[1] = new MyTestClass { ValueA = 0, ValueB = 1 };
items[2] = new MyTestClass { ValueA = 1, ValueB = 1 };

MyTestClass[] distinctItems = items.DistinctBy(p => new { p.ValueA, p.ValueB }).ToArray();
MyTestClass[] duplicates = items.Except(distinctItems).ToArray();

Note, however, that this returns only one item of the duplicate pair (the later occurrence), not both.
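
The reason only the later occurrence comes back is that DistinctBy keeps the first item for each key, and Except then compares the remaining items by reference, since MyTestClass does not override Equals. If DistinctBy is not available, a rough equivalent can be sketched with a HashSet of value tuples (this assumes C# 7 or later). The RepeatFinder class and FindRepeats method are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;

public class MyTestClass
{
    public int ValueA { get; set; }
    public int ValueB { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class RepeatFinder
{
    // Returns each item whose (ValueA, ValueB) pair was already seen earlier
    // in the sequence; the first occurrence of a pair is never returned.
    public static MyTestClass[] FindRepeats(IEnumerable<MyTestClass> items)
    {
        var seen = new HashSet<(int, int)>();
        var repeats = new List<MyTestClass>();
        foreach (var item in items)
        {
            // HashSet.Add returns false when the key is already present.
            if (!seen.Add((item.ValueA, item.ValueB)))
            {
                repeats.Add(item);
            }
        }
        return repeats.ToArray();
    }
}
```

For the array in the question, this should return only items[2], matching the behaviour described above.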

DaveH answered Oct 09 '22 10:10


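MyTestClass needs an equality definition. The Equals method below takes two arguments, so it belongs in a class that implements IEqualityComparer<MyTestClass> (together with a matching GetHashCode), not in MyTestClass itself.

public bool Equals(MyTestClass x, MyTestClass y)
{
    // Same instance (or both null): equal.
    if (Object.ReferenceEquals(x, y)) return true;

    // Exactly one side is null: not equal.
    if (Object.ReferenceEquals(x, null) ||
        Object.ReferenceEquals(y, null))
        return false;

    // Compare both properties of x against those of y.
    return x.ValueA == y.ValueA && x.ValueB == y.ValueB;
}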

Here is a good article about it.

After that, you can get a "clean" list of MyTestClass objects, with the duplicates removed, by using the Distinct method.
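
As a sketch of how this might fit together, the comparer below pairs Equals with a GetHashCode that hashes the same two properties, because Distinct and GroupBy use the hash code before they call Equals. The names MyTestClassComparer and ComparerDemo are assumptions made for this example, and the value-tuple hash assumes C# 7 or later.

```csharp
using System.Collections.Generic;
using System.Linq;

public class MyTestClass
{
    public int ValueA { get; set; }
    public int ValueB { get; set; }
}

// Hypothetical comparer name; treats two items as equal when both properties match.
public class MyTestClassComparer : IEqualityComparer<MyTestClass>
{
    public bool Equals(MyTestClass x, MyTestClass y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.ValueA == y.ValueA && x.ValueB == y.ValueB;
    }

    // Must agree with Equals: equal items produce equal hash codes.
    public int GetHashCode(MyTestClass obj)
    {
        if (obj is null) return 0;
        return (obj.ValueA, obj.ValueB).GetHashCode();
    }
}

// Hypothetical usage helpers.
public static class ComparerDemo
{
    // The "clean" list: one item per distinct (ValueA, ValueB) pair.
    public static MyTestClass[] DistinctItems(MyTestClass[] items) =>
        items.Distinct(new MyTestClassComparer()).ToArray();

    // The duplicates themselves: all items from groups with more than one member.
    public static MyTestClass[] DuplicateItems(MyTestClass[] items) =>
        items.GroupBy(i => i, new MyTestClassComparer())
             .Where(g => g.Count() > 1)
             .SelectMany(g => g)
             .ToArray();
}
```

Distinct on its own removes the duplicates rather than reporting them, so the second helper reuses the same comparer with GroupBy to answer the original question.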

zapico answered Oct 09 '22 10:10