
C# Linq Find duplicates with multiple group by

Tags: c#, linq

I have a list in C# as:

List<Data> uData = new List<Data>();

where uData is populated from the UI as:

{
   Id: 1,
   Name: "Smith",
   Input: "7,8",
   Output: "Output1",
   CreatedBy: "swallac",
   CreatedON: "12/01/2018"
},
{
   Id: 2,
   Name: "Austin",
   Input: "9,10",
   Output: "Output1",
   CreatedBy: "amanda",
   CreatedON: "12/03/2018"
},
{
   Id: 3,
   Name: "Smith",
   Input: "22,22",
   Output: "Output2",
   CreatedBy: "swallac",
   CreatedON: "12/01/2018"
},
{
   Id: 4,
   Name: "Smith",
   Input: "9,8",
   Output: "Output2",
   CreatedBy: "aaa",
   CreatedON: "12/01/2018"
},
{
   Id: 5,
   Name: "Peter",
   Input: "7,8",
   Output: "Output3",
   CreatedBy: "swallac",
   CreatedON: "12/02/2018"
}
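A minimal sketch of how the sample above could be declared in C#, assuming a Data class with one string or int property per key. The class shape and the SampleData helper are assumptions made for illustration; the post does not show them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of the Data class, inferred from the sample records.
public class Data
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Input { get; set; }
    public string Output { get; set; }
    public string CreatedBy { get; set; }
    public string CreatedON { get; set; }
}

public static class SampleData
{
    // Builds the five sample records in the order they are listed.
    public static List<Data> Build()
    {
        return new List<Data>
        {
            new Data { Id = 1, Name = "Smith",  Input = "7,8",   Output = "Output1", CreatedBy = "swallac", CreatedON = "12/01/2018" },
            new Data { Id = 2, Name = "Austin", Input = "9,10",  Output = "Output1", CreatedBy = "amanda",  CreatedON = "12/03/2018" },
            new Data { Id = 3, Name = "Smith",  Input = "22,22", Output = "Output2", CreatedBy = "swallac", CreatedON = "12/01/2018" },
            new Data { Id = 4, Name = "Smith",  Input = "9,8",   Output = "Output2", CreatedBy = "aaa",     CreatedON = "12/01/2018" },
            new Data { Id = 5, Name = "Peter",  Input = "7,8",   Output = "Output3", CreatedBy = "swallac", CreatedON = "12/02/2018" },
        };
    }

    public static void Main()
    {
        List<Data> uData = Build();
        Console.WriteLine(uData.Count); // prints 5
    }
}
```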

What I want to do is search this list by the "Output" key and find out whether the corresponding combination of "Input" and "CreatedBy" values is duplicated anywhere.

For example, the list above has three Output values: Output1, Output2 and Output3. For the entries whose Output is "Output1" and "Output3", the combined "Input" and "CreatedBy" value is duplicated: both have "7,8" and "swallac". This is what I want to highlight.

For this I tried out the below query:

var myList = uData.GroupBy(l => l.Output)
                  .SelectMany(g => g.GroupBy(x => (x.Input, x.CreatedBy)).Where(x => x.Count() > 1))
                  .SelectMany(x => x);

This does not give me any error, but it does not give me the desired result either, as it lists all the data. What am I missing here?

--Updated---

Earlier I wanted to make sure that an Input is not repeated within one Output, for which I had the query below.

uData.GroupBy(l => l.Output)
    .Any(g => g.GroupBy(x => x.Input).Any(x => x.Count() > 1))
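For reference, a minimal sketch of what that earlier check evaluates to on the five sample records. The trimmed Data class and the InputCheck helper are assumptions made for illustration, and the property is spelled Output so that it matches the sample keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed, hypothetical Data class: only the fields this query touches.
public class Data
{
    public int Id { get; set; }
    public string Input { get; set; }
    public string Output { get; set; }
}

public static class InputCheck
{
    // True when at least one Output group contains the same Input twice.
    public static bool HasRepeatedInputWithinOutput(IEnumerable<Data> items)
    {
        return items.GroupBy(l => l.Output)
                    .Any(g => g.GroupBy(x => x.Input).Any(x => x.Count() > 1));
    }

    public static void Main()
    {
        var uData = new List<Data>
        {
            new Data { Id = 1, Input = "7,8",   Output = "Output1" },
            new Data { Id = 2, Input = "9,10",  Output = "Output1" },
            new Data { Id = 3, Input = "22,22", Output = "Output2" },
            new Data { Id = 4, Input = "9,8",   Output = "Output2" },
            new Data { Id = 5, Input = "7,8",   Output = "Output3" },
        };

        // "7,8" appears twice, but under different Outputs, so this prints False.
        Console.WriteLine(HasRepeatedInputWithinOutput(uData));
    }
}
```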

Now I want another query to check if the combination of Input and CreatedBy is repeated in the list.

I tried the query posted above and, as per the suggestion, the query below:

uData.GroupBy(g=> new {g.CreatedBy,g.Input})
    .Where(w=>w.Count() > 1)

But this returns the whole list instead of only the duplicates.

Updated to add an example link:

https://dotnetfiddle.net/HWMYp6

I have created the example in the link above.

In the example I want to mark the set with Id 10 and Output "output5" as the duplicate, because that combination of Input and CreatedBy already exists in Ids 1, 2 and 3 (all of which belong to output1). So basically, a combination of Input and CreatedBy that appears in one set should not be repeated in another set, where the sets are identified by Output. Sorry if my initial post was not very clear.

asked Dec 13 '18 14:12 by Brenda



2 Answers

It seems like you want to group by "CreatedBy" and "Input" only, in which case a slight modification to your current query should suffice:

var result = uData.GroupBy(x => (x.Input, x.CreatedBy))
                  .Where(x => x.Count() > 1)
                  .SelectMany(x => x);

I've simply removed the GroupBy for the Output field.

  • GroupBy reads as "group by Input and CreatedBy"
  • Where reads as "retain the groups where there are two or more items"
  • SelectMany collapses the nested sequences into an IEnumerable<Data>
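The three steps above can be sketched as a small runnable program against the question's sample records. The trimmed Data class and the DuplicateFinder helper are assumptions made for illustration; only the query itself comes from this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed, hypothetical Data class: only the fields the query touches.
public class Data
{
    public int Id { get; set; }
    public string Input { get; set; }
    public string Output { get; set; }
    public string CreatedBy { get; set; }
}

public static class DuplicateFinder
{
    // The five sample records from the question (Name and CreatedON omitted).
    public static List<Data> Sample()
    {
        return new List<Data>
        {
            new Data { Id = 1, Input = "7,8",   Output = "Output1", CreatedBy = "swallac" },
            new Data { Id = 2, Input = "9,10",  Output = "Output1", CreatedBy = "amanda"  },
            new Data { Id = 3, Input = "22,22", Output = "Output2", CreatedBy = "swallac" },
            new Data { Id = 4, Input = "9,8",   Output = "Output2", CreatedBy = "aaa"     },
            new Data { Id = 5, Input = "7,8",   Output = "Output3", CreatedBy = "swallac" },
        };
    }

    // Returns every record whose (Input, CreatedBy) pair occurs more than once.
    public static List<Data> FindDuplicates(IEnumerable<Data> items)
    {
        return items.GroupBy(x => (x.Input, x.CreatedBy))
                    .Where(g => g.Count() > 1)
                    .SelectMany(g => g)
                    .ToList();
    }

    public static void Main()
    {
        var ids = FindDuplicates(Sample()).Select(d => d.Id);
        Console.WriteLine(string.Join(", ", ids)); // prints "1, 5"
    }
}
```

Note that both members of a duplicated pair are returned, including the first occurrence.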

Update:

Given your edit, you're looking for:

var myList = uData.GroupBy(x => new {x.Input, x.CreatedBy})
                  .SelectMany(x => x.GroupBy(z => z.Output).Skip(1))
                  .SelectMany(x => x);
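
A minimal sketch of how the updated query behaves on the question's sample records. The trimmed Data class and the CrossOutputDuplicates helper are assumptions made for illustration. The sketch relies on LINQ to Objects yielding groups in order of first appearance, so "first Output" means the first one encountered in the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed, hypothetical Data class: only the fields the query touches.
public class Data
{
    public int Id { get; set; }
    public string Input { get; set; }
    public string Output { get; set; }
    public string CreatedBy { get; set; }
}

public static class CrossOutputDuplicates
{
    // The five sample records from the question (Name and CreatedON omitted).
    public static List<Data> Sample()
    {
        return new List<Data>
        {
            new Data { Id = 1, Input = "7,8",   Output = "Output1", CreatedBy = "swallac" },
            new Data { Id = 2, Input = "9,10",  Output = "Output1", CreatedBy = "amanda"  },
            new Data { Id = 3, Input = "22,22", Output = "Output2", CreatedBy = "swallac" },
            new Data { Id = 4, Input = "9,8",   Output = "Output2", CreatedBy = "aaa"     },
            new Data { Id = 5, Input = "7,8",   Output = "Output3", CreatedBy = "swallac" },
        };
    }

    // For each (Input, CreatedBy) pair, skips the first Output that used it
    // and returns the records from every later Output that repeats the pair.
    public static List<Data> FindRepeats(IEnumerable<Data> items)
    {
        return items.GroupBy(x => new { x.Input, x.CreatedBy })
                    .SelectMany(g => g.GroupBy(z => z.Output).Skip(1))
                    .SelectMany(g => g)
                    .ToList();
    }

    public static void Main()
    {
        var ids = FindRepeats(Sample()).Select(d => d.Id);
        Console.WriteLine(string.Join(", ", ids)); // prints "5"
    }
}
```

A pair that repeats only inside a single Output produces one inner group, so Skip(1) leaves nothing and those records are not reported.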

answered Nov 04 '22 03:11 by Ousmane D.


You want to know, for any Output, whether there is a matching Input and CreatedBy combination elsewhere, so group by Input and CreatedBy and keep only the groups whose count is greater than 1.

var myList = uData.GroupBy(g=> new {g.CreatedBy,g.Input})
    .Where(w=>w.Count() > 1);
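
This query yields groups rather than a flat list of records, so each result carries a Key plus its members. A minimal sketch of reading those groups on the question's sample records follows; the trimmed Data class and the DuplicateGroups helper are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed, hypothetical Data class: only the fields the query touches.
public class Data
{
    public int Id { get; set; }
    public string Input { get; set; }
    public string Output { get; set; }
    public string CreatedBy { get; set; }
}

public static class DuplicateGroups
{
    // The five sample records from the question (Name and CreatedON omitted).
    public static List<Data> Sample()
    {
        return new List<Data>
        {
            new Data { Id = 1, Input = "7,8",   Output = "Output1", CreatedBy = "swallac" },
            new Data { Id = 2, Input = "9,10",  Output = "Output1", CreatedBy = "amanda"  },
            new Data { Id = 3, Input = "22,22", Output = "Output2", CreatedBy = "swallac" },
            new Data { Id = 4, Input = "9,8",   Output = "Output2", CreatedBy = "aaa"     },
            new Data { Id = 5, Input = "7,8",   Output = "Output3", CreatedBy = "swallac" },
        };
    }

    // One line per duplicated (CreatedBy, Input) combination, listing member Ids.
    public static List<string> Describe(IEnumerable<Data> items)
    {
        return items.GroupBy(g => new { g.CreatedBy, g.Input })
                    .Where(w => w.Count() > 1)
                    .Select(w => w.Key.CreatedBy + " / " + w.Key.Input + ": "
                                 + string.Join(", ", w.Select(d => d.Id)))
                    .ToList();
    }

    public static void Main()
    {
        foreach (var line in Describe(Sample()))
        {
            Console.WriteLine(line); // prints "swallac / 7,8: 1, 5"
        }
    }
}
```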

answered Nov 04 '22 05:11 by Kevin LaBranche