For enterprise reasons I can't override hashCode, and I must use Java 6 (but I can use Guava).
What's the best/simplest/quickest/most efficient/[insert indeterminate adjective equivalent to best] mechanism to remove duplicate beans from a Java collection?
A duplicate is defined by a subset of getters returning the same values, e.g.
pojoA.getVal() == pojoB.getVal() && pojoA.getOtherVal() == pojoB.getOtherVal()
Wrap the objects of interest into your own class, and override its hashCode/equals to pay attention to a specific subset of attributes. Make a hash set of wrappers, then harvest the objects from the set to get a duplicate-free subset.
Here is an example:
// Declared as an interface here so the body-less getters compile;
// in practice this is your existing bean class.
interface ActualData {
    String getAttr1();
    String getAttr2();
    String getAttr3();
    String getAttr4();
}
Let's say you want to pay attention to attributes 1, 2, and 4. Then you can make a wrapper like this:
import java.util.Arrays;

class Wrapper {
    private final ActualData data;
    private final int hash;

    public Wrapper(ActualData data) {
        this.data = data;
        // Compute the hash once from attr1, 2, and 4 only.
        // Arrays.hashCode is one possible choice that works on Java 6.
        this.hash = Arrays.hashCode(new Object[] {
            data.getAttr1(), data.getAttr2(), data.getAttr4()});
    }

    public ActualData getData() {
        return data;
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Wrapper)) return false;
        Wrapper other = (Wrapper) obj;
        // Assumes the compared attributes are never null.
        return data.getAttr1().equals(other.data.getAttr1())
            && data.getAttr2().equals(other.data.getAttr2())
            && data.getAttr4().equals(other.data.getAttr4());
    }
}
Now you can make a HashSet<Wrapper>:
// Java 6 has no diamond operator, so the type argument is spelled out.
Set<Wrapper> set = new HashSet<Wrapper>();
for (ActualData item : listWithDuplicates) {
    if (!set.add(new Wrapper(item))) {
        System.out.println("Item " + item + " was a duplicate");
    }
}
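The loop above fills the set but does not show the "harvest" step. Below is a minimal, self-contained sketch of the whole round trip, written to compile on Java 6. The names Bean, BeanKey, DedupDemo and dedupe are hypothetical stand-ins, not part of the original answer. Two choices here are my own assumptions: a LinkedHashSet is used so the first occurrence of each duplicate group is kept in its original order, and attribute comparison is null-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical bean standing in for ActualData; it does not override equals/hashCode.
class Bean {
    private final String attr1, attr2, attr3, attr4;

    Bean(String attr1, String attr2, String attr3, String attr4) {
        this.attr1 = attr1;
        this.attr2 = attr2;
        this.attr3 = attr3;
        this.attr4 = attr4;
    }

    String getAttr1() { return attr1; }
    String getAttr2() { return attr2; }
    String getAttr3() { return attr3; }
    String getAttr4() { return attr4; }
}

// Wrapper whose equality depends only on attr1, attr2 and attr4.
class BeanKey {
    private final Bean data;
    private final int hash;

    BeanKey(Bean data) {
        this.data = data;
        this.hash = Arrays.hashCode(new Object[] {
            data.getAttr1(), data.getAttr2(), data.getAttr4()});
    }

    Bean getData() { return data; }

    @Override
    public int hashCode() { return hash; }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof BeanKey)) return false;
        Bean other = ((BeanKey) obj).data;
        return same(data.getAttr1(), other.getAttr1())
            && same(data.getAttr2(), other.getAttr2())
            && same(data.getAttr4(), other.getAttr4());
    }

    // Null-safe equality, since java.util.Objects is not available on Java 6.
    private static boolean same(Object a, Object b) {
        return a == null ? b == null : a.equals(b);
    }
}

class DedupDemo {
    // Returns the input without duplicates, keeping the first bean of each group.
    static List<Bean> dedupe(List<Bean> input) {
        Set<BeanKey> seen = new LinkedHashSet<BeanKey>();
        for (Bean bean : input) {
            seen.add(new BeanKey(bean)); // add() leaves an existing equal key in place
        }
        // Harvest step: unwrap the surviving keys back into beans.
        List<Bean> result = new ArrayList<Bean>(seen.size());
        for (BeanKey key : seen) {
            result.add(key.getData());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Bean> beans = new ArrayList<Bean>();
        beans.add(new Bean("a", "b", "x", "d"));
        beans.add(new Bean("a", "b", "y", "d")); // differs only in attr3, so a duplicate
        beans.add(new Bean("a", "c", "x", "d"));
        System.out.println(dedupe(beans).size()); // prints 2
    }
}
```

If ordering does not matter, a plain HashSet would do the same job. Since Guava is allowed, its Equivalence class offers a ready-made wrapper along the same lines, which may be worth a look before hand-rolling one.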