Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is iteration via Collections.synchronizedSet(...).forEach() guaranteed to be thread safe?

As we know, iterating over a concurrent collection is not thread safe by default, so one cannot use:

Set<E> set = Collections.synchronizedSet(new HashSet<>()); //fill with data for (E e : set) {     process(e); } 

This happens as data may be added during iteration, because there is no exclusive lock on set.

This is describe in the javadoc of Collections.synchronizedSet:

public static Set synchronizedSet(Set s)

Returns a synchronized (thread-safe) set backed by the specified set. In order to guarantee serial access, it is critical that all access to the backing set is accomplished through the returned set.

It is imperative that the user manually synchronize on the returned set when iterating over it:

Set s = Collections.synchronizedSet(new HashSet());
...
synchronized (s) { Iterator i = s.iterator(); // Must be in the synchronized block while (i.hasNext()) foo(i.next()); }

Failure to follow this advice may result in non-deterministic behavior.

However, this does not apply to Set.forEach, which inherits the default method forEach from Iterable.forEach.

Now I looked into the source code, and here we can see that we have the following structure:

  1. We ask for a Collections.synchronizedSet().
  2. We get one:

    public static <T> Set<T> synchronizedSet(Set<T> s) {     return new SynchronizedSet<>(s); }  ...  static class SynchronizedSet<E>       extends SynchronizedCollection<E>       implements Set<E> {     private static final long serialVersionUID = 487447009682186044L;      SynchronizedSet(Set<E> s) {         super(s);     }     SynchronizedSet(Set<E> s, Object mutex) {         super(s, mutex);     }      public boolean equals(Object o) {         if (this == o)             return true;         synchronized (mutex) {return c.equals(o);}     }     public int hashCode() {         synchronized (mutex) {return c.hashCode();}     } } 
  3. It extends SynchronizedCollection, which has the following interesting methods next to the obvious ones:

    // Override default methods in Collection @Override public void forEach(Consumer<? super E> consumer) {     synchronized (mutex) {c.forEach(consumer);} } @Override public boolean removeIf(Predicate<? super E> filter) {     synchronized (mutex) {return c.removeIf(filter);} } @Override public Spliterator<E> spliterator() {     return c.spliterator(); // Must be manually synched by user! } @Override public Stream<E> stream() {     return c.stream(); // Must be manually synched by user! } @Override public Stream<E> parallelStream() {     return c.parallelStream(); // Must be manually synched by user! } 

The mutex used here is the same object as to which all operations of Collections.synchronizedSet lock to.

Now we can, judging by the implementation say that it is thread safe to use Collections.synchronizedSet(...).forEach(...), but is it also thread safe by specification?

(Confusingly enough, Collections.synchronizedSet(...).stream().forEach(...) is not thread safe by implementation, and the verdict of the specification seems to be unknown aswell.)

like image 917
skiwi Avatar asked May 03 '14 21:05

skiwi


1 Answers

As you wrote, judging by implementation, forEach() is thread-safe for the collections provided with JDK (see disclaimer below) as it requires monitor of mutex to be acquired to proceed.

Is it also thread safe by specification?

My opinion - no, and here is an explanation. Collections.synchronizedXXX() javadoc, rewritten in short words, says - "all methods are thread-safe except for those used for iterating over it".

My other, although very subjective argument is what yshavit wrote - unless told/read that, consider API/class/whatever not thread-safe.

Now, let's take a closer look at the javadocs. I guess I may state that method forEach() is used to iterate over it, so, following the advice from javadoc, we should consider it not thread-safe, although it is opposite to reality (implementation).

Anyway, I agree with yshavit's statement that the documentation should be updated as this is most likely a documentation, not implementation flaw. But, no one can say for sure except for JDK developers, see concerns below.

The last point I'd like to mention within this discussion - we can assume that custom collection can be wrapped with Collections.synchronizedXXX(), and the implementation of forEach() of this collection can be... can be anything. The collection might perform asynchronous processing of elements within the forEach() method, spawn a thread for each element... it is bounded only by author's imagination, and synchronized(mutex) wrap cannot guarantee thread-safety for such cases. That particular issue might be the reason not to declare forEach() method as thread-safe..

like image 117
Alexey Malev Avatar answered Sep 24 '22 19:09

Alexey Malev