I have a List of type string in a .NET 3.5 project. The list has thousands of strings in it, but for the sake of brevity we're going to say that it just has 5 strings in it.
List<string> lstStr = new List<string>() {
"Apple", "Banana", "Coconut", "Coconut", "Orange"};
Assume that the list is sorted (as you can tell above). What I need is a LINQ query that will remove all strings that are not duplicates. So the result would leave me with a list that only contains the two "Coconut" strings.
Is this possible to do with a LINQ query? If it is not then I'll have to resort to some complex for loops, which I can do, but I didn't want to unless I had to.
In SQL, to select duplicate values you create groups of rows with the same values and then keep only the groups with a count greater than one. You can achieve that with GROUP BY and a HAVING clause.
GROUP BY arranges identical data into groups: if a column has the same value in several rows, those rows end up in one group. The same idea carries over to LINQ, where GroupBy plays the role of GROUP BY and a Where on the group count plays the role of HAVING.
Here is code for finding duplicates in an array (shown with integers; the same query works on a list of strings):
int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 };
var duplicates = listOfItems
    .GroupBy(i => i)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key);
foreach (var d in duplicates)
    Console.WriteLine(d);
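Note that selecting g.Key yields each duplicated value only once, whereas the question asks for a list that still contains both "Coconut" strings. A minimal sketch of one way to keep every occurrence is to flatten the groups with SelectMany instead. The names DuplicateFilter and KeepDuplicates are hypothetical helpers invented for this illustration, not part of the original answers, and the code sticks to calls available in .NET 3.5.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateFilter
{
    // Hypothetical helper (not from the original answers): returns every
    // element whose value occurs more than once, keeping all occurrences.
    public static List<string> KeepDuplicates(IEnumerable<string> source)
    {
        return source
            .GroupBy(s => s)            // bucket equal strings together
            .Where(g => g.Count() > 1)  // keep buckets with two or more members
            .SelectMany(g => g)         // flatten the buckets back into strings
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        List<string> lstStr = new List<string> {
            "Apple", "Banana", "Coconut", "Coconut", "Orange" };

        List<string> dupes = DuplicateFilter.KeepDuplicates(lstStr);

        // On .NET 3.5, string.Join needs an array rather than an IEnumerable.
        Console.WriteLine(string.Join(", ", dupes.ToArray())); // prints "Coconut, Coconut"
    }
}
```

This version does not rely on the list being sorted, and GroupBy makes a single pass over the input to build the groups.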
var dupes = lstStr.Where(x => lstStr.Sum(y => y==x ? 1 : 0) > 1);
OR
var dupes = lstStr.Where((x, i) => (i > 0 && x == lstStr[i - 1])
    || (i < lstStr.Count - 1 && x == lstStr[i + 1]));
Note that the first query enumerates the whole list for every element, which takes O(n²) time, but it does not assume a sorted list. The second is O(n), but it relies on the list being sorted because it only compares each element with its immediate neighbours.