Generally, concurrent collections are safe to iterate; according to Javadoc: 'Iterators are weakly consistent, returning elements reflecting the state of the set at some point at or since the creation of the iterator. They do not throw ConcurrentModificationException, and may proceed concurrently with other operations.' However, consider this:
import java.util.Random;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListSet;
public class ConcurrencyProblem {
private static volatile boolean modifierIsAlive = true;
public static void main(String[] args) {
final ConcurrentSkipListSet<Integer> concurrentSet = new ConcurrentSkipListSet<>();
Thread modifier = new Thread() {
private final Random randomGenerator = new Random();
public void run() {
while (modifierIsAlive) {
concurrentSet.add(randomGenerator.nextInt(1000));
concurrentSet.remove(randomGenerator.nextInt(1000));
}
};
};
modifier.start();
int sum = 0;
while (modifierIsAlive) {
try {
TreeSet<Integer> sortedCopy = new TreeSet<>(concurrentSet);
// make sure the copy operation is not eliminated by the compiler
sum += sortedCopy.size();
} catch (RuntimeException rte) {
modifierIsAlive = false;
rte.printStackTrace();
}
}
System.out.println("Dummy output: " + sum);
}
}
The output is
java.util.NoSuchElementException
at java.util.concurrent.ConcurrentSkipListMap$Iter.advance(ConcurrentSkipListMap.java:2299)
at java.util.concurrent.ConcurrentSkipListMap$KeyIterator.next(ConcurrentSkipListMap.java:2334)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2559)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2547)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2579)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2579)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2579)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2579)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2579)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2579)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2579)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2504)
at java.util.TreeMap.addAllForTreeSet(TreeMap.java:2462)
at java.util.TreeSet.addAll(TreeSet.java:308)
at java.util.TreeSet.<init>(TreeSet.java:172)
at mtbug.ConcurrencyProblem.main(ConcurrencyProblem.java:27)
Dummy output: 44910
I'm wondering if this is a bug or a feature; we did not get a ConcurrentModificationException, but still, having to care about iteration (falling back to synchronized blocks or otherwise) kind of defeats the purpose of ConcurrentSkipListSet/Map. I've been able to reproduce this both with Java 7 and 8 (currently, 8u72 on my Linux box).
The ConcurrentSkipListSet class in Java is a part of the Java Collection Framework and implements the Collection interface and the AbstractSet class. It provides a scalable and concurrent version of NavigableSet in Java. The implementation of ConcurrentSkipListSet is based on ConcurrentSkipListMap.
How do you fix Java's ConcurrentModificationException? There are two basic approaches: Do not make any changes to a collection while an Iterator loops through it. If you can't stop the underlying collection from being modified during iteration, create a clone of the target data structure and iterate through the clone.
TreeSet cannot contain duplicate elements. The elements in a TreeSet are sorted as per their natural ordering, or based on a custom Comparator that is supplied at the time of creation of the TreeSet. TreeSet cannot contain null value. TreeSet internally uses a TreeMap to store elements.
TreeSet implements the SortedSet interface. So, duplicate values are not allowed and will be leftovers. Objects in a TreeSet are stored in a sorted and ascending order. TreeSet does not preserve the insertion order of elements but elements are sorted by keys.
As far as I can understand from browsing the sources, the problem with TreeSet
is that it calls size()
before iterating and then uses it instead of calling hasNext()
. This may be a bug, but I think it's just a consequence of red-black trees being complicated structures requiring careful balancing, and therefore knowing size in advance is needed to properly balance it in linear time during creation.
You may circumvent this by iterating manually and adding elements to the TreeSet
, but this will lead to n log n
complexity, which could be the reason why TreeSet
's constructor doesn't do it that way (its API spec guarantees linear time). Of course it could still call hasNext()
as it builds the tree, but then some additional actions may be required after the construction is finished to rebalance the tree, which could lead to amortized linear complexity. But red-black trees are a mess as they are, and that kind of hack would make the implementation even messier.
Still, I think it's very confusing and should probably be documented somewhere in the API docs, but I'm not sure where exactly. Probably in the part where they explain what weakly consistent iterators are. Specifically, it should be mentioned that some library classes rely on the returned size and therefore may throw NoSuchElementException
. Mentioning specific classes would also help.
I'm actually starting to lean towards this being a bug in TreeSet
/TreeMap
(update, it is). The issue, as Sergey alludes, is that TreeMap
caches the result of ConcurrentSkipListSet.size()
before reading its elements.
TreeSet.addAll()
callsTreeMap.addAllForTreeSet()
and passes the collection's current size and potentially concurrent Iterator
toTreeMap.buildFromSorted()
which ultimately calls Iterator.next()
size
-times.In other words, it assumes the Collection
it is passed will not be modified during construction, which is an erroneous assumption.
Note that even if buildFromSorted()
did call Iterator.hasNext()
its only option at that point would be to fail, since the backing data structure was modified mid-construction.
Looking at other collections that could potentially have some issue copying concurrent structures, including ArrayList
, LinkedList
, and CopyOnWriteArrayList
(most other collections I looked at simply for-each over the elements), explicitly copy the provided collection to an array before doing any actual work in order to avoid this exact issue. I think TreeSet
and TreeMap
should be doing the same thing.
We actually don't have to accept O(n log n) performance due to this bug, but it's going to be a hack. We can't simply copy the values into an array or other data structure, because then inserting into the TreeSet
won't be linear time. But we can lie to TreeSet
by claiming the copy is a SortedSet
.
public static class IterateOnlySortedSet<E>
extends AbstractSet<E> implements SortedSet<E> {
private final ArrayList<E> elements;
private final Comparator<? super E> comparator;
public IterateOnlySortedSet(SortedSet<E> source) {
elements = new ArrayList<>(source);
comparator = source.comparator();
}
@Override
public Iterator<E> iterator() {
return elements.iterator();
}
@Override
public int size() {
return elements.size();
}
@Override
public Comparator<? super E> comparator() {
return comparator;
}
// remaining methods simply throw UnsupportedOperationException
}
Changing your TreeSet
construction line to:
TreeSet<Integer> sortedCopy = new TreeSet<>(new IterateOnlySortedSet<>(concurrentSet));
Now succeeds.
Nice find :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With