When trying to remove objects wrapped in iterator Java 8 and Java 9+ behave differently. Consider the following example:
import java.util.Date;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
class Scratch {
public static void main(String[] args) {
Set<Date> dates = new HashSet<>();
dates.add(new Date(100));
dates.add(new Date(200));
for (Date date : dates) {
System.out.println("Initial "+date.getTime()+":"+date.hashCode());
date.setTime(date.getTime()+42);
System.out.println("Mutated "+date.getTime()+":"+date.hashCode()+"\n");
}
System.out.println("Size before remove iteration: "+dates.size());
Iterator<Date> iterator = dates.iterator();
while (iterator.hasNext()) {
Date date = iterator.next();
System.out.println("In loop: "+date.getTime()+":"+date.hashCode());
iterator.remove();
}
System.out.println("Size after remove iteration: "+dates.size());
}
}
After mutating objects inside the HashSet Java 8 refuses to remove them using iterator, check the size after removal loop. Java 8 output:
Initial 100:100
Mutated 142:142
Initial 200:200
Mutated 242:242
Size before remove iteration: 2
In loop: 142:142
In loop: 242:242
Size after remove iteration: 2
Java 9+ output is same as above but:
Size after remove iteration: 0
Why does this happen?
Java 8 applications use packages as a top-level component whereas Java 9 applications use modules as a top-level component. Each Java 9 module has only one module with one module descriptor whereas Java 8 package doesn't build multiple modules into a single module.
Set<Integer> test1 = new HashSet<Integer>(); test1. add(1); test1. add(2); test1. add(3); Set<Integer> test2 = new HashSet<Integer>(); test2.
Hash set and tree set both belong to the collection framework. HashSet is the implementation of the Set interface whereas Tree set implements sorted set. Tree set is backed by TreeMap while HashSet is backed by a hashmap.
The most straightforward way to convert a set to a list is by passing the set as an argument while creating the list. This calls the constructor and from there onwards the constructor takes care of the rest. Since the set has been converted to a list, the elements are now ordered.
Something changed about HashSet
between Java 8 and Java 9, but the specifics are not actually that interesting, because the way you are using the Set
is already specified to be wrong (emphasis mine):
Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set.
Since Date.equals()
depends on the time the Date
represents, you do exactly that.
And since you do that the behavior of the set is no longer specified.
Which means it can misbehave in any way/shape/form and still be a conforming implementation.
You could try to find out why Java 9 specifically behaves differently now (I don't know myself), but it doesn't change the fundamental issue that any JVM could at any point in time behave differently yet again if you use the set the wrong way.
Edit: out of curiosity I did investigate what exactly was different and found a relevant change: In both OpenJDK 8 and 9 HashSet
is implemented based on a HashMap
, so this all focuses on HashMap
.
In Java 8 the remove()
method of the relevant Iterator
implementation contains this line:
K key = p.key;
removeNode(hash(key), key, null, false, false);
This re-hashes (i.e. gets the current hash) of the key
(i.e. your Date
) and tries to remove that from the Map
. Since that new hash was never added in the first place (back when that key was added it had a different hash), this will not find a node and thus not remove anything.
In Java 9 that code looks like this instead:
removeNode(p.hash, p.key, null, false, false);
This will simply pass the previously calculated-and-remembered hash p.hash
to the removeNode
method which will thus be able to find and remove the node in question.
The changeset that introduced that change mentions this OpenJDK bug.
The comments in there (notably also by Doug Lea) seem to agree that "fixing" the behavior in the face of misused sets is not a goal, but that it could be faster to not re-compute the hash. In other words: that change was done for performance reasons and not for correctness reasons.
So to summarize and re-iterate: both of those behaviors are acceptable implementations, because by changing the equals()
behavior of your set entries, you've broken the contract already.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With