Using C# 3 and .NET Framework 3.5, I have a Person object:
public class Person
{
public int Id { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public int SSN { get; set; }
}
and I've got a List of them:
List<Person> persons = GetPersons();
How can I get all the Person objects in persons whose SSN is not unique in the list, remove them from the persons list, and ideally add them to another list called "List<Person> dupes"?
The original list might look something like this:
persons = new List<Person>();
persons.Add(new Person { Id = 1,
FirstName = "Chris",
LastName="Columbus",
SSN=111223333 }); // Is a dupe
persons.Add(new Person { Id = 1,
FirstName = "E.E.",
LastName="Cummings",
SSN=987654321 });
persons.Add(new Person { Id = 1,
FirstName = "John",
LastName="Steinbeck",
SSN=111223333 }); // Is a dupe
persons.Add(new Person { Id = 1,
FirstName = "Yogi",
LastName="Berra",
SSN=123456789 });
And the end result would have Cummings and Berra in the original persons list and would have Columbus and Steinbeck in a list called dupes.
Many thanks!
This gets you the duplicated SSN:
var duplicatedSSN =
from p in persons
group p by p.SSN into g
where g.Count() > 1
select g.Key;
The list of duplicated Person objects would then be:
var duplicated = persons.FindAll( p => duplicatedSSN.Contains(p.SSN) );
Then iterate over the duplicates and remove them from persons:
duplicated.ForEach( dup => persons.Remove(dup) );
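One thing to be aware of: duplicatedSSN is a deferred LINQ query, so calling Contains on it inside FindAll re-runs the grouping for every person. The following is a minimal sketch of the same idea, assuming it is acceptable to copy the duplicated SSNs into a HashSet<int> first. The names DupeSplitter and ExtractDupes are hypothetical and only there for illustration; HashSet<T> is available in .NET Framework 3.5.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class DupeSplitter
{
    // Removes every Person whose SSN occurs more than once from 'persons'
    // and returns the removed items as a new list.
    public static List<Person> ExtractDupes(List<Person> persons)
    {
        // Run the grouping once and keep the duplicated SSNs in a set.
        var duplicatedSSN = new HashSet<int>(
            persons.GroupBy(p => p.SSN)
                   .Where(g => g.Count() > 1)
                   .Select(g => g.Key));

        // Copy the duplicates out first, then drop them from the original list.
        List<Person> dupes = persons.FindAll(p => duplicatedSSN.Contains(p.SSN));
        persons.RemoveAll(p => duplicatedSSN.Contains(p.SSN));
        return dupes;
    }
}
```

With the sample list from the question, List<Person> dupes = DupeSplitter.ExtractDupes(persons); should leave Cummings and Berra in persons and return Columbus and Steinbeck.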
This builds on the recommendation by @gcores above.
If you want to add a single Person for each duplicated SSN back to the persons list, add the following lines (SSN is an int, so the grouping key is an int as well):
IEnumerable<IGrouping<int, Person>> query = duplicated.GroupBy(d => d.SSN, d => d);
foreach (IGrouping<int, Person> duplicateGroup in query)
{
persons.Add(duplicateGroup.First());
}
My assumption here is that you may want to keep the first Person for each SSN and remove only the later entries that duplicate it.
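If keeping one Person per SSN is the goal, a single pass may be simpler than removing every duplicate and adding one back. This is a rough sketch under that assumption; KeepFirstSplitter and ExtractLaterDupes are hypothetical names. Unlike the snippet above, which appends the kept entries to the end of persons, this version keeps the survivors in their original order.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class KeepFirstSplitter
{
    // Keeps the first Person seen for each SSN in 'persons' and
    // returns every later Person with an already-seen SSN.
    public static List<Person> ExtractLaterDupes(List<Person> persons)
    {
        var seen = new HashSet<int>();
        var unique = new List<Person>();
        var dupes = new List<Person>();

        foreach (Person p in persons)
        {
            // HashSet<int>.Add returns false when the SSN is already in the set.
            if (seen.Add(p.SSN))
                unique.Add(p);
            else
                dupes.Add(p);
        }

        // Replace the contents of the original list with the survivors.
        persons.Clear();
        persons.AddRange(unique);
        return dupes;
    }
}
```

On the sample list from the question this should leave Columbus, Cummings and Berra in persons and return only Steinbeck.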
Thanks to gcores for getting me started down a correct path. Here's what I ended up doing:
var duplicatedSSN =
from p in persons
group p by p.SSN into g
where g.Count() > 1
select g.Key;
var duplicates = new List<Person>();
foreach (var dupeSSN in duplicatedSSN)
{
foreach (var person in persons.FindAll(p => p.SSN == dupeSSN))
duplicates.Add(person);
}
duplicates.ForEach(dup => persons.Remove(dup));
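The nested loops can probably be folded into the query itself. A sketch, assuming the same Person class as in the question; DupeQuery and MoveDupes are hypothetical names. SelectMany flattens each group of duplicates back into individual Person objects, and ToList() forces the query to run before persons is modified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class DupeQuery
{
    // Moves every Person with a non-unique SSN out of 'persons'
    // and returns them as a new list.
    public static List<Person> MoveDupes(List<Person> persons)
    {
        // Group by SSN, keep groups with more than one member,
        // and flatten those groups into a single list.
        List<Person> dupes = persons
            .GroupBy(p => p.SSN)
            .Where(g => g.Count() > 1)
            .SelectMany(g => g)
            .ToList();

        // Safe to modify persons now because the query has already been evaluated.
        dupes.ForEach(dup => persons.Remove(dup));
        return dupes;
    }
}
```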