As we know, iterating over a concurrent collection is not thread safe by default, so one cannot use:
Set<E> set = Collections.synchronizedSet(new HashSet<>()); //fill with data for (E e : set) { process(e); }
This happens as data may be added during iteration, because there is no exclusive lock on set
.
This is describe in the javadoc of Collections.synchronizedSet
:
public static Set synchronizedSet(Set s)
Returns a synchronized (thread-safe) set backed by the specified set. In order to guarantee serial access, it is critical that all access to the backing set is accomplished through the returned set.
It is imperative that the user manually synchronize on the returned set when iterating over it:
Set s = Collections.synchronizedSet(new HashSet());
...synchronized (s) { Iterator i = s.iterator(); // Must be in the synchronized block while (i.hasNext()) foo(i.next()); }
Failure to follow this advice may result in non-deterministic behavior.
However, this does not apply to Set.forEach
, which inherits the default method forEach
from Iterable.forEach.
Now I looked into the source code, and here we can see that we have the following structure:
Collections.synchronizedSet()
.We get one:
public static <T> Set<T> synchronizedSet(Set<T> s) { return new SynchronizedSet<>(s); } ... static class SynchronizedSet<E> extends SynchronizedCollection<E> implements Set<E> { private static final long serialVersionUID = 487447009682186044L; SynchronizedSet(Set<E> s) { super(s); } SynchronizedSet(Set<E> s, Object mutex) { super(s, mutex); } public boolean equals(Object o) { if (this == o) return true; synchronized (mutex) {return c.equals(o);} } public int hashCode() { synchronized (mutex) {return c.hashCode();} } }
It extends SynchronizedCollection
, which has the following interesting methods next to the obvious ones:
// Override default methods in Collection @Override public void forEach(Consumer<? super E> consumer) { synchronized (mutex) {c.forEach(consumer);} } @Override public boolean removeIf(Predicate<? super E> filter) { synchronized (mutex) {return c.removeIf(filter);} } @Override public Spliterator<E> spliterator() { return c.spliterator(); // Must be manually synched by user! } @Override public Stream<E> stream() { return c.stream(); // Must be manually synched by user! } @Override public Stream<E> parallelStream() { return c.parallelStream(); // Must be manually synched by user! }
The mutex
used here is the same object as to which all operations of Collections.synchronizedSet
lock to.
Now we can, judging by the implementation say that it is thread safe to use Collections.synchronizedSet(...).forEach(...)
, but is it also thread safe by specification?
(Confusingly enough, Collections.synchronizedSet(...).stream().forEach(...)
is not thread safe by implementation, and the verdict of the specification seems to be unknown aswell.)
As you wrote, judging by implementation, forEach()
is thread-safe for the collections provided with JDK (see disclaimer below) as it requires monitor of mutex to be acquired to proceed.
Is it also thread safe by specification?
My opinion - no, and here is an explanation. Collections.synchronizedXXX()
javadoc, rewritten in short words, says - "all methods are thread-safe except for those used for iterating over it".
My other, although very subjective argument is what yshavit wrote - unless told/read that, consider API/class/whatever not thread-safe.
Now, let's take a closer look at the javadocs. I guess I may state that method forEach()
is used to iterate over it, so, following the advice from javadoc, we should consider it not thread-safe, although it is opposite to reality (implementation).
Anyway, I agree with yshavit's statement that the documentation should be updated as this is most likely a documentation, not implementation flaw. But, no one can say for sure except for JDK developers, see concerns below.
The last point I'd like to mention within this discussion - we can assume that custom collection can be wrapped with Collections.synchronizedXXX()
, and the implementation of forEach()
of this collection can be... can be anything. The collection might perform asynchronous processing of elements within the forEach()
method, spawn a thread for each element... it is bounded only by author's imagination, and synchronized(mutex) wrap cannot guarantee thread-safety for such cases. That particular issue might be the reason not to declare forEach()
method as thread-safe..
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With