I've got a List<string> that contains duplicates and I need to find the indexes of each.
What is the most elegant, efficient way to do this, other than looping through all the items? I'm on .NET 4.0, so LINQ is an option. I've done tons of searching and can't find anything.
Sample data:
var data = new List<string> { "fname", "lname", "home", "home", "company" };
I need to get the indexes of "home".
Duplicate indexes are those that exactly match the Key and Included columns. That's easy. Possible duplicate indexes are those that very closely match Key/Included columns.
function checkIfArrayIsUnique(myArray) {
    for (var i = 0; i < myArray.length; i++) {
        for (var j = i + 1; j < myArray.length; j++) {
            if (myArray[i] == myArray[j]) {
                return true; // means there are duplicate values
            }
        }
    }
    return false; // means there are no duplicate values.
}
You can create an object from each item containing its index, then group on the value and keep only the groups containing more than one object. Now you have a list of groupings, each holding objects with the text and their original index:
var duplicates = data
    .Select((t, i) => new { Index = i, Text = t })
    .GroupBy(g => g.Text)
    .Where(g => g.Count() > 1);
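To read the indexes out of that grouping, each group can be projected to its key and the list of original positions. Here is a minimal sketch of that step; the names DuplicateIndexFinder and FindDuplicateIndexes are made up for illustration, and it assumes the list contains no null entries (ToDictionary rejects null keys).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateIndexFinder
{
    // Hypothetical helper: maps each duplicated value to all of its indexes.
    // Assumes data contains no null strings.
    public static Dictionary<string, List<int>> FindDuplicateIndexes(IEnumerable<string> data)
    {
        return data
            .Select((t, i) => new { Index = i, Text = t })  // pair each item with its position
            .GroupBy(x => x.Text)                            // group equal strings together
            .Where(g => g.Count() > 1)                       // keep only values that repeat
            .ToDictionary(g => g.Key, g => g.Select(x => x.Index).ToList());
    }
}
```

Called on the sample list from the question, this should return a single entry, "home", with the indexes 2 and 3.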
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        var data = new List<string> { "fname", "lname", "home", "home", "company" };
        foreach (var duplicate in FindDuplicates(data))
        {
            Console.WriteLine("Duplicate: {0} at index {1}", duplicate.Item1, duplicate.Item2);
        }
    }

    // Yields each repeated item together with the index of the repeat.
    // The first occurrence of a value is not reported, only the later ones.
    public static IEnumerable<Tuple<T, int>> FindDuplicates<T>(IEnumerable<T> data)
    {
        var hashSet = new HashSet<T>(); // values seen so far
        int index = 0;
        foreach (var item in data)
        {
            if (hashSet.Contains(item))
            {
                yield return Tuple.Create(item, index);
            }
            else
            {
                hashSet.Add(item);
            }
            index++;
        }
    }
}
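Note that FindDuplicates reports only the second and later occurrences, so for the sample data it yields index 3 but not index 2. If every position of a duplicated value is needed, one option is to collect the indexes per value in a dictionary during a single pass. Here is a sketch of that variation; DuplicateIndexes.Find is a hypothetical name, and the items are assumed to be non-null because they are used as dictionary keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateIndexes
{
    // Hypothetical helper: returns every index of each value that occurs more than once.
    // Assumes items are non-null (they are used as dictionary keys).
    public static Dictionary<T, List<int>> Find<T>(IEnumerable<T> data)
    {
        var positions = new Dictionary<T, List<int>>();
        int index = 0;
        foreach (var item in data)
        {
            List<int> list;
            if (!positions.TryGetValue(item, out list))
            {
                // First time this value is seen: start a new index list.
                list = new List<int>();
                positions.Add(item, list);
            }
            list.Add(index);
            index++;
        }

        // Keep only the values that were seen more than once.
        return positions
            .Where(p => p.Value.Count > 1)
            .ToDictionary(p => p.Key, p => p.Value);
    }
}
```

The list is still walked once, as in the HashSet version, but the first occurrence is kept as well, at the cost of storing an index list for every distinct value.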