I have a table with many duplicate records. How can I use a C# LINQ query to group by multiple fields but also get the ID of the distinct record?
For example, I have a Store table that looks like this:
Store
-----
ID, StoreName, Address1
1, CVS, 123 Main Street
2, CVS, 123 Main Street
3, CVS, 456 Main Street
I want to group by StoreName and then by Address1, but also get the ID of the first record for each distinct (StoreName, Address1) pair.
I only want these two records:
1, CVS, 123 Main Street
3, CVS, 456 Main Street
I tried using an anonymous object but can't figure out how to get the ID without also grouping by ID.
When you use GroupBy in LINQ, you get an enumerable of objects that implement the interface IGrouping<TKey,TElement>, where TElement represents the individual, original elements that have been grouped.
Therefore you would group like so:
var grouped = myList.GroupBy(x => new {x.StoreName,x.Address1});
This gives you an enumerable of IGrouping<anonymous,Store> (assuming the original object was called Store). The anonymous key is an anonymous object with two properties, StoreName and Address1.
Now, each item in the enumerable is itself an enumerable of Store, and you can treat it exactly as you would any other enumerable: you can take First().ID if you wish, and you can project the whole lot back out to an enumerable if that's what you want:
// No semicolon after GroupBy(...): the Select call must chain onto it.
var result = myList.GroupBy(x => new {x.StoreName,x.Address1})
    .Select(g => new {
        g.Key.StoreName,
        g.Key.Address1,
        FirstID = g.First().ID
    });
Something like this should work. You must apply an aggregate to any field that is not part of the grouping key:
var result = list.GroupBy(c => new { c.StoreName, c.Address1 })
    .Select(c => new
    {
        ID = c.Min(i => i.ID),
        c.Key.StoreName,
        c.Key.Address1
    }).ToList();
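To try this against the sample data from the question, here is a minimal, self-contained sketch. The Store class, the Dedupe class and the FirstPerGroup helper are hypothetical names introduced only for illustration, and the list is an in-memory List<Store> (LINQ to Objects); a database-backed provider may translate the query differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type mirroring the Store table in the question.
public class Store
{
    public int ID { get; set; }
    public string StoreName { get; set; }
    public string Address1 { get; set; }
}

public static class Dedupe
{
    // Returns one Store per distinct (StoreName, Address1) pair,
    // carrying the smallest ID found in that group.
    public static List<Store> FirstPerGroup(IEnumerable<Store> stores)
    {
        return stores
            .GroupBy(s => new { s.StoreName, s.Address1 })
            .Select(g => new Store
            {
                ID = g.Min(s => s.ID),
                StoreName = g.Key.StoreName,
                Address1 = g.Key.Address1
            })
            .ToList();
    }

    public static void Main()
    {
        // Sample rows copied from the question.
        var stores = new List<Store>
        {
            new Store { ID = 1, StoreName = "CVS", Address1 = "123 Main Street" },
            new Store { ID = 2, StoreName = "CVS", Address1 = "123 Main Street" },
            new Store { ID = 3, StoreName = "CVS", Address1 = "456 Main Street" }
        };

        foreach (var s in FirstPerGroup(stores))
            Console.WriteLine($"{s.ID}, {s.StoreName}, {s.Address1}");
        // Prints:
        // 1, CVS, 123 Main Street
        // 3, CVS, 456 Main Street
    }
}
```

Min(s => s.ID) is used here rather than First().ID so the result does not depend on the order of the input; with an in-memory list sorted by ID, as in the question, both give the same rows.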