Let's say you have a class and you create a HashSet which can store this instances of this class. If you try to add instances which are equal, only one instance is kept in the collection, and that is fine.
However if you have two different instances in the HashSet, and you take one and make it an exact copy of the other (by copying the fields), the HashSet will then contain two duplicate instances.
Here is the code which demonstrates this:
public static void main(String[] args)
{
HashSet<GraphEdge> set = new HashSet<>();
GraphEdge edge1 = new GraphEdge(1, "a");
GraphEdge edge2 = new GraphEdge(2, "b");
GraphEdge edge3 = new GraphEdge(3, "c");
set.add(edge1);
set.add(edge2);
set.add(edge3);
edge2.setId(1);
edge2.setName("a");
for(GraphEdge edge: set)
{
System.out.println(edge.toString());
}
if(edge2.equals(edge1))
{
System.out.println("Equals");
}
else
{
System.out.println("Not Equals");
}
}
public class GraphEdge
{
private int id;
private String name;
//Constructor ...
//Getters & Setters...
public int hashCode()
{
int hash = 7;
hash = 47 * hash + this.id;
hash = 47 * hash + Objects.hashCode(this.name);
return hash;
}
public boolean equals(Object o)
{
if(o == this)
{
return true;
}
if(o instanceof GraphEdge)
{
GraphEdge anotherGraphEdge = (GraphEdge) o;
if(anotherGraphEdge.getId() == this.id && anotherGraphEdge.getName().equals(this.name))
{
return true;
}
}
return false;
}
}
The output from the above code:
1 a
1 a
3 c
Equals
Is there a way to force the HashSet to validate its contents so that possible duplicate entries created as in the above scenario get removed?
A possible solution could be to create a new HashSet and copy the contents from one hashset to another so that the new hashset won't contain duplicates however I don't like this solution.
The situation you describe is invalid. See the Javadoc: "The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set."
To add to @EJP's answer, what will happen in practice if you mutate objects in a HashSet
to make them duplicates (in the sense of the equals
/ hashcode
contract) is that the hash table data structure will break.
Depending on the exact details of the mutation, and the state of the hash table, one or both of the instances will become invisible to lookup (e.g. contains
and other operations). Either it is on the wrong hash chain, or because the other instance appears before it on the hash chain. And it is hard to predict which instance will be visible ... and whether it will remain visible.
If you iterate the set, both instances will still be present ... in violation of the Set
contract.
Of course, this is very broken from the application perspective.
You can avoid this problem by either:
From the perspective of correctness and robustness, the first option is clearly best.
Incidentally, it would be really difficult to "fix" this in a general way. There is no pervasive mechanism in Java for knowing ... or being notified ... that some element has changed. You can implement such a mechanism on a class by class basis, but it has to be coded explicitly (and it won't be cheap). Even if you did have such a mechanism, what would you do? Clearly one of the objects should now be removed from the set ... but which one?
You are correct and I don't think there is any way to protect against the case you discuss. All of collections which use hashing and equals are subject to this problem. The collection has no notification that the object has changed since it was added to the collection. I think the solution you outline is good.
If you are so concerned with this issue, perhaps you need to rethink your data structures. You could use immutable objects for instance. With immutable objects you would not have this problem.
HashSet
is not aware of its member's properties changing after the object has been added. If this is a problem for you, then you may want to consider making GraphEdge
immutable. For example:
GraphEdge edge4 = edge2.changeName("new_name");
In the case where GraphEdge
is immutable, changing a value result in returning a new instance rather changing the existing instance.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With