Let me start by saying I've read these questions: 1 & 2. I understand that I can write code to find duplicates in my List, but my problem is that I want to update the original list, not just query and print the duplicates.
I know I can't update the collection the query returns, as it's not a view; it's an IEnumerable<T> of an anonymous type.
I want to be able to find duplicates in my list and mark a property I've created called State, which is used later in the application.
Has anyone run into this problem, and can you point me in the right direction?
P.S. The approach I'm using at the moment is a bubble-sort-style loop that goes through the list item by item and compares key fields. Obviously this isn't the fastest method.
EDIT:
In order to consider an item in the list a "duplicate", there are three fields which must match. We'll call them Field1, Field2, and Field3.
I have an overloaded Equals() method on the base class which compares these fields.
The only time I skip an object in my MarkDuplicates() method is if the object's state is UNKNOWN or ERROR; otherwise, I test it.
Let me know if you need more details.
Thanks again!
For each element in the list, count how often it occurs using the Collections.frequency() method (Java). If the frequency of an element is more than one, that element is a duplicate.
I think the easiest way is to start by writing an extension method which finds duplicates in a list of objects. Since your objects use .Equals(), they can be compared in most common collections (hash-based ones such as HashSet<T> also need a matching GetHashCode()).
    public static IEnumerable<T> FindDuplicates<T>(this IEnumerable<T> enumerable) {
        var hashset = new HashSet<T>();
        foreach ( var cur in enumerable ) {
            // Add returns false when an equal item is already in the set
            if ( !hashset.Add(cur) ) {
                yield return cur;
            }
        }
    }
Now it should be pretty easy to update your collection for duplicates. For instance:
    List<SomeType> list = GetTheList();
    list
        .FindDuplicates()
        .ToList()
        .ForEach(x => x.State = "DUPLICATE");
If you already have a ForEach extension method defined in your code, you can avoid the .ToList().
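One caveat with the HashSet approach: it relies on GetHashCode() as well as Equals(), so both need to be overridden consistently. Here is a minimal sketch of how that could look together with the skip rule from the question. The Item class, the string field types, the "NEW" default, and the DuplicateMarker name are all assumptions for illustration, not the asker's actual code:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type; Field1..Field3 stand in for the key fields from the question.
public class Item
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
    public string Field3 { get; set; }
    public string State { get; set; } = "NEW"; // assumed default state

    public override bool Equals(object obj) =>
        obj is Item other
        && Field1 == other.Field1
        && Field2 == other.Field2
        && Field3 == other.Field3;

    // Must agree with Equals, otherwise HashSet<T> will not detect the duplicates.
    public override int GetHashCode() => (Field1, Field2, Field3).GetHashCode();
}

public static class DuplicateMarker
{
    // Marks every repeat of an already-seen item; the first occurrence is left alone.
    // Items whose state is UNKNOWN or ERROR are skipped, as described in the question.
    public static void MarkDuplicates(List<Item> items)
    {
        var seen = new HashSet<Item>();
        foreach (var item in items)
        {
            if (item.State == "UNKNOWN" || item.State == "ERROR")
                continue;
            if (!seen.Add(item))
                item.State = "DUPLICATE";
        }
    }
}
```

Because the loop works on the objects in the original list, the State changes are visible there directly, and the whole pass is a single walk over the list instead of a nested comparison.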
Your objects have some sort of state property. You're presumably finding duplicates based on another property or set of properties. Why not:
    List<object> keys = new List<object>();
    foreach (MyObject obj in myList)
    {
        // a key seen before means this object is a duplicate
        if (keys.Contains(obj.keyProperty))
            obj.state = "something indicating a duplicate here";
        else
            keys.Add(obj.keyProperty);
    }
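If every member of a duplicate group should be flagged, including the first occurrence, grouping by the key fields is another option. The following is only a sketch using LINQ's GroupBy with a tuple key; the Record class, its Key1/Key2 properties, and the state strings are hypothetical names chosen for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type with two key properties and a state flag.
public class Record
{
    public string Key1 { get; set; }
    public string Key2 { get; set; }
    public string State { get; set; } = "NEW"; // assumed default state
}

public static class GroupMarker
{
    // Flags all records that share their (Key1, Key2) pair with at least one other record.
    public static void MarkAllDuplicates(List<Record> records)
    {
        var duplicateGroups = records
            .GroupBy(r => (r.Key1, r.Key2))   // value tuples compare field by field
            .Where(g => g.Count() > 1);

        foreach (var group in duplicateGroups)
            foreach (var record in group)
                record.State = "DUPLICATE";   // same object instances as in the list
    }
}
```

The groups contain references to the original objects rather than projected copies, so setting State here updates the items in the source list.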