Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Different types of thread-safe Sets in Java

People also ask

What are Java thread-safe types?

Atomic Objects It's also possible to achieve thread-safety using the set of atomic classes that Java provides, including AtomicInteger, AtomicLong, AtomicBoolean and AtomicReference. Atomic classes allow us to perform atomic operations, which are thread-safe, without using synchronization.

Which set is thread-safe in Java?

Even before Java 8, there is a class called CopyOnWriteArraySet which allows you to create a thread-safe set in Java.

Is HashSet thread-safe?

Thread Safe HashSet Using ConcurrentHashMap Factory Method Basically, this method returns an instance that respects the java. util. Set interface and allows the usage of standard methods like add(), contains(), etc.

What are thread-safe methods?

If a method (instance or static) only references variables scoped within that method then it is thread safe because each thread has its own stack: In this instance, multiple threads could call ThreadSafeMethod concurrently without issue.


  1. The CopyOnWriteArraySet is a quite simple implementation - it basically has a list of elements in an array, and when changing the list, it copies the array. Iterations and other accesses which are running at this time continue with the old array, avoiding necessity of synchronization between readers and writers (though writing itself needs to be synchronized). The normally fast set operations (especially contains()) are quite slow here, as the arrays will be searched in linear time.

    Use this only for really small sets which will be read (iterated) often and changed seldom. (Swing's listener-sets would be an example, but these are not really sets, and should be only used from the EDT anyway.)

  2. Collections.synchronizedSet will simply wrap a synchronized-block around each method of the original set. You should not access the original set directly. This means that no two methods of the set can be executed concurrently (one will block until the other finishes) - this is thread-safe, but you will not have concurrency if multiple threads are using the set. If you use the iterator, you usually still need to synchronize externally to avoid ConcurrentModificationExceptions when modifying the set between iterator calls. The performance will be like the performance of the original set (but with some synchronization overhead, and blocking if used concurrently).

    Use this if you only have low concurrency, and want to be sure all changes are immediately visible to the other threads.

  3. ConcurrentSkipListSet is the concurrent SortedSet implementation, with most basic operations in O(log n). It allows concurrent adding/removing and reading/iteration, where iteration may or may not tell about changes since the iterator was created. The bulk operations are simply multiple single calls, and not done atomically – other threads may observe only some of them.

    Obviously you can use this only if you have some total order on your elements. This looks like an ideal candidate for high-concurrency situations, for not-too-large sets (because of the O(log n)).

  4. For the ConcurrentHashMap (and the Set derived from it): Here most basic options are (on average, if you have a good and fast hashCode()) in O(1) (but might degenerate to O(n) when many keys have the same hash code), like for HashMap/HashSet. There is a limited concurrency for writing (the table is partitioned, and write access will be synchronized on the needed partition), while read access is fully concurrent to itself and the writing threads (but might not yet see the results of the changes currently being written). The iterator may or may not see changes since it was created, and bulk operations are not atomic. Resizing is slow (as for HashMap/HashSet), thus you should try to avoid this by estimating the needed size on creation (and using about 1/3 more of that, as it resizes when 3/4 full).

    Use this when you have large sets, a good (and fast) hash function and can estimate the set size and needed concurrency before creating the map.

  5. Are there other concurrent map implementations one could use here?


It is possible to combine the contains() performance of HashSet with the concurrency-related properties of CopyOnWriteArraySet by using the AtomicReference<Set> and replacing the whole set on each modification.

The implementation sketch:

public abstract class CopyOnWriteSet<E> implements Set<E> {

    private final AtomicReference<Set<E>> ref;

    protected CopyOnWriteSet( Collection<? extends E> c ) {
        ref = new AtomicReference<Set<E>>( new HashSet<E>( c ) );
    }

    @Override
    public boolean contains( Object o ) {
        return ref.get().contains( o );
    }

    @Override
    public boolean add( E e ) {
        while ( true ) {
            Set<E> current = ref.get();
            if ( current.contains( e ) ) {
                return false;
            }
            Set<E> modified = new HashSet<E>( current );
            modified.add( e );
            if ( ref.compareAndSet( current, modified ) ) {
                return true;
            }
        }
    }

    @Override
    public boolean remove( Object o ) {
        while ( true ) {
            Set<E> current = ref.get();
            if ( !current.contains( o ) ) {
                return false;
            }
            Set<E> modified = new HashSet<E>( current );
            modified.remove( o );
            if ( ref.compareAndSet( current, modified ) ) {
                return true;
            }
        }
    }

}

If the Javadocs don't help, you probably should just find a book or article to read about data structures. At a glance:

  • CopyOnWriteArraySet makes a new copy of the underlying array every time you mutate the collection, so writes are slow and Iterators are fast and consistent.
  • Collections.synchronizedSet() uses old-school synchronized method calls to make a Set threadsafe. This would be a low-performing version.
  • ConcurrentSkipListSet offers performant writes with inconsistent batch operations (addAll, removeAll, etc.) and Iterators.
  • Collections.newSetFromMap(new ConcurrentHashMap()) has the semantics of ConcurrentHashMap, which I believe isn't necessarily optimized for reads or writes, but like ConcurrentSkipListSet, has inconsistent batch operations.