When is CopyOnWriteArraySet useful as a thread-safe alternative to HashSet?

In Java, ConcurrentHashMap is the thread-safe counterpart of HashMap and ConcurrentSkipListMap is the thread-safe counterpart of TreeMap, but there is no ConcurrentHashSet counterpart for HashSet.

Instead, there are usually four ways to get a thread-safe Set:

  1. Set<String> mySet = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
  2. Set<String> s = Collections.synchronizedSet(new HashSet<String>());
  3. ConcurrentSkipListSet<E>
  4. CopyOnWriteArraySet<E>
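
As a rough sketch, the four options above can be constructed like this. The class name ThreadSafeSets and its helper methods are made up for illustration; ConcurrentHashMap.newKeySet() is a Java 8+ shorthand for option 1 that the question does not mention.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical helper class, only here to show the four constructions side by side.
class ThreadSafeSets {
    // 1. Hash-based set backed by a ConcurrentHashMap.
    static Set<String> hashBacked() {
        return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    }

    // 1 (Java 8+ shorthand): same idea without the explicit map.
    static Set<String> hashBackedJava8() {
        return ConcurrentHashMap.newKeySet();
    }

    // 2. Synchronized wrapper around a plain HashSet.
    static Set<String> synchronizedWrapper() {
        return Collections.synchronizedSet(new HashSet<String>());
    }

    // 3. Sorted concurrent set.
    static Set<String> sorted() {
        return new ConcurrentSkipListSet<String>();
    }

    // 4. Copy-on-write set; iterates in insertion order.
    static Set<String> copyOnWrite() {
        return new CopyOnWriteArraySet<String>();
    }

    // Adds "b", "a", "b" so the duplicate handling and ordering are visible.
    static Set<String> fill(Set<String> set) {
        set.add("b");
        set.add("a");
        set.add("b");
        return set;
    }

    public static void main(String[] args) {
        System.out.println(fill(hashBacked()).size()); // 2
        System.out.println(fill(sorted()));            // [a, b]
        System.out.println(fill(copyOnWrite()));       // [b, a]
    }
}
```

All five factories return a Set<String>, so calling code can switch between them without other changes.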

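1 wraps a ConcurrentHashMap with Collections.newSetFromMap(), so the resulting Set is backed by the map's keys and inherits the map's thread safety.

2 uses a synchronized wrapper, which locks on every operation; this approach does not seem to be recommended.

3 is based on ConcurrentSkipListMap (so it keeps its elements sorted) and is widely used.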

4 is based on CopyOnWriteArrayList, so it shares the same basic properties. The following is taken from the CopyOnWriteArraySet documentation (http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CopyOnWriteArraySet.html):

  • It is best suited for applications in which set sizes generally stay small, read-only operations vastly outnumber mutative operations, and you need to prevent interference among threads during traversal.
  • It is thread-safe.
  • Mutative operations (add, set, remove, etc.) are expensive since they usually entail copying the entire underlying array.
  • Iterators do not support the mutative remove operation.
  • Traversal via iterators is fast and cannot encounter interference from other threads.
  • Iterators rely on unchanging snapshots of the array at the time the iterators were constructed.
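
The iterator-related points in the list above can be checked with a small single-threaded sketch. The class and method names are hypothetical; only the CopyOnWriteArraySet behaviour comes from the documentation.

```java
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical demo class for the snapshot semantics of CopyOnWriteArraySet iterators.
class SnapshotIterationDemo {
    // Adds elements while iterating and returns how many elements the loop visited.
    static int elementsSeenWhileAdding() {
        Set<String> set = new CopyOnWriteArraySet<>();
        set.add("a");
        set.add("b");
        int seen = 0;
        for (String s : set) {
            // A plain HashSet would typically throw ConcurrentModificationException here.
            set.add(s + "-copy");
            seen++;
        }
        // The loop only saw the snapshot ("a", "b"); the set itself now holds 4 elements.
        return seen;
    }

    // Returns true if Iterator.remove() is rejected, as the documentation states.
    static boolean iteratorRemoveIsUnsupported() {
        Set<String> set = new CopyOnWriteArraySet<>();
        set.add("a");
        Iterator<String> it = set.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(elementsSeenWhileAdding());     // 2
        System.out.println(iteratorRemoveIsUnsupported()); // true
    }
}
```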

Since 1 and 3 are commonly used, why does CopyOnWriteArraySet exist? When is CopyOnWriteArraySet useful?

Added: CopyOnWriteArraySet is based on CopyOnWriteArrayList, and contains on a List is O(n), whereas a Set is supposed to offer a fast contains operation. Could anybody explain this?

coderz asked Mar 25 '15 07:03



2 Answers

It is useful when you need a thread-safe collection that holds a small number of elements.

One example is a Set of listeners. You need to ensure uniqueness and iterate over them efficiently.
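
A minimal sketch of that listener use case, assuming a made-up EventSource class with String events and Consumer<String> listeners (none of these names come from a real API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;
import java.util.function.Consumer;

// Hypothetical event source that keeps its listeners in a CopyOnWriteArraySet.
class EventSource {
    private final Set<Consumer<String>> listeners = new CopyOnWriteArraySet<>();

    // Returns false if the listener was already registered (uniqueness).
    boolean addListener(Consumer<String> listener) {
        return listeners.add(listener);
    }

    boolean removeListener(Consumer<String> listener) {
        return listeners.remove(listener);
    }

    // Iterates over a snapshot, so listeners may add or remove listeners while being notified.
    void fire(String event) {
        for (Consumer<String> listener : listeners) {
            listener.accept(event);
        }
    }

    // Registers a logger twice plus a listener that unregisters itself, then fires two events.
    static List<String> demo() {
        EventSource source = new EventSource();
        List<String> log = new ArrayList<>();
        Consumer<String> logger = e -> log.add("logger:" + e);
        Consumer<String> oneShot = new Consumer<String>() {
            @Override
            public void accept(String e) {
                log.add("oneShot:" + e);
                source.removeListener(this); // safe during notification
            }
        };
        source.addListener(logger);
        source.addListener(logger); // duplicate, ignored
        source.addListener(oneShot);
        source.fire("start");
        source.fire("stop");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [logger:start, oneShot:start, logger:stop]
    }
}
```

Registration changes are rare and notifications are frequent here, which matches the read-mostly profile the documentation describes.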

BTW, CopyOnWriteArraySet has the lowest memory overhead on a per-reference basis. It can be as little as 1/6 the size of the other collections. This is particularly useful if you have a lot of such sets.

"while Set data structure is for high performance contains operation, could anybody explain this?"

CopyOnWriteArraySet (COWAS) is more efficient in terms of memory, and its contains is faster for small collections than the alternatives. What counts as "high performance" depends on the use case.

Peter Lawrey answered Oct 04 '22 17:10


Copy-on-write structures are functionally immutable.

Java at one point had a very poor story for providing immutable views on writeable structures such as sets. For example, if you had a set member, and you returned it publicly, the caller could just turn around and edit it, and therefore be editing your object's internal state! But what else can you do, copy the entire thing before returning from any public function? That would be pointlessly slow.

This was the story earlier in Java's history. The platform relied almost exclusively on immutable objects (String is an example). Collections were an exception to this pattern, and were therefore problematic from an encapsulation perspective. Collections.unmodifiableCollection and Collections.unmodifiableSet address part of the problem (they date from Java 1.2, so they predate CopyOnWriteArraySet, which arrived in Java 5), but they only provide a read-only view over a collection that can still change underneath the caller, and I still find them more cumbersome than what other languages offer, especially when using custom data structures. What a copy-on-write set adds is a stable snapshot for readers: an iterator obtained from a CopyOnWriteArraySet never sees later modifications, and no copy is made when reading. Note that the set itself can still be changed through add and remove, so handing it out directly does not by itself protect your object's internal state.
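
To make the encapsulation concern concrete, here is a small single-threaded sketch. The Registry class and its accessors are hypothetical; it compares returning the internal set directly with returning a Collections.unmodifiableSet view.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class with an internal set and two ways of exposing it.
class Registry {
    private final Set<String> names = new HashSet<>();

    void register(String name) {
        names.add(name);
    }

    int size() {
        return names.size();
    }

    // Leaks internal state: callers can mutate the returned set.
    Set<String> namesLeaky() {
        return names;
    }

    // Read-only view: callers cannot mutate it, but they do see later changes.
    Set<String> namesView() {
        return Collections.unmodifiableSet(names);
    }

    static boolean leakyAccessorAllowsTampering() {
        Registry r = new Registry();
        r.register("alice");
        r.namesLeaky().clear(); // edits the registry's internal state
        return r.size() == 0;
    }

    static boolean viewRejectsWrites() {
        Registry r = new Registry();
        r.register("alice");
        try {
            r.namesView().add("mallory");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    static boolean viewTracksLaterChanges() {
        Registry r = new Registry();
        r.register("alice");
        Set<String> view = r.namesView();
        r.register("bob");
        return view.contains("bob");
    }

    public static void main(String[] args) {
        System.out.println(leakyAccessorAllowsTampering()); // true
        System.out.println(viewRejectsWrites());            // true
        System.out.println(viewTracksLaterChanges());       // true
    }
}
```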

Copy-on-write was a fad several years ago, but for write-heavy multi-threaded code it is notoriously inefficient compared with other models, because every mutation copies the whole backing array. According to the documentation you've posted, iteration is fast because each iterator works on the snapshot of the array that existed when the iterator was created, which means the class spends memory and write time to make reads cheap. So it's a perfectly okay class to use as long as your data is small and writes are rare, because the copies won't add up to much wasted memory.

VoidStar answered Oct 04 '22 16:10