What is the best way to find and mark duplicate objects in a Collection? Let us say we have a List persons and our duplicate strategy is based on exact match of first name and last name.
Is there a simple way of doing this with guava?
You don't need Guava to do this:
List<Person> people = ...
Map<Name, Person> peopleByName = new HashMap<>();
for (Person person : people) {
// Name is a simple value class with equality based on its fields
Name name = new Name(person.getFirstName(), person.getLastName());
Person firstPersonWithName = peopleByName.get(name);
if (firstPersonWithName == null) {
peopleByName.put(name, person);
} else {
// or whatever you do to mark a duplicate
person.setDuplicateOf(firstPersonWithName);
}
}
That said, you could use a Guava Table
instead of a Map
and avoid needing to create the Name
... use first name as the row keys and last name as column keys, say.
Another choice would be to use Multimaps.index
to index all the people in your list by name. Then for every list of people mapped to a particular name, the first person would be the first person with that name from your list and the others would be duplicates.
Why not try overriding .equals() in the person object. Then add a new field to each person object 'duplicateOf' or something.
Then just loop over the array, checking each person against the others. If the persons 'duplicateOf' field is null skip it. If .equals() returns true you can set the 'duplicateOf' field.
You can try to use Guava's TreeMultimap.
Create a new one TreeMultimap initializing it with a comparator for comparing you persons as you like: TreeMultimap.create(Comparator, Ordering.arbitrary())
Here is a unit test:
package org.test.guava;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import org.junit.Test;
import com.google.common.collect.Multimap;
import com.google.common.collect.Ordering;
import com.google.common.collect.TreeMultimap;
public class GuavaTest {
private static class Person {
private String name;
public Person(String name) {
this.name = name;
}
public String getName() {
return name;
}
@Override
public String toString() {
return "Person [name=" + name + "]";
}
}
@Test
public void test() throws Exception {
List<Person> persons = Arrays.asList(new Person("person1"), new Person("person2"), new Person("person1"));
Comparator<Person> comparator = new Comparator<Person>() {
public int compare(Person o1, Person o2) {
return o1.getName().compareTo(o2.getName());
}
};
Multimap<Person, Person> groups = TreeMultimap.create(comparator, Ordering.arbitrary());
for(Person person : persons) {
groups.put(person, person);
}
System.out.println(groups.asMap());
}
}
The class Person must implement the boolean equals(Object o)
.
Then you can find duplicates this way:
You have somewhere: Collection<Person> list;
Person[] persons = list.toArray();
Integer[] duplicateOf = new Integer[persons.length];
Arrays.fill(duplicateOf, -1);
// For all the values in the Collection
for (int i = 0; i < persons.length; i++) {
// Find the duplicate
for (int j = 0; j < persons.length; j++) {
if (persons[i].equals(persons[j]) && i != j)
duplicateOf[j] = i;
}
}
Now you have the Array duplicateOf
which you can read this way: The duplicate of element j
is at the index duplicateOf[j]
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With