ConcurrentHashMap parallelismThreshold

ConcurrentHashMap gained several new methods in Java 8. I have two questions about them:

  1. Why aren't they declared in ConcurrentMap?
  2. What exactly does the parallelismThreshold mean or do?
Benedikt Bünz asked Jun 30 '14 17:06




2 Answers

  1. These new methods seem to rely on implementation details specific to ConcurrentHashMap, but only the Java 8 authors could say for sure (they do browse Stack Overflow).

  2. From the Javadoc of ConcurrentHashMap:

    These bulk operations accept a parallelismThreshold argument. Methods proceed sequentially if the current map size is estimated to be less than the given threshold. Using a value of Long.MAX_VALUE suppresses all parallelism. Using a value of 1 results in maximal parallelism by partitioning into enough subtasks to fully utilize the ForkJoinPool.commonPool() that is used for all parallel computations. Normally, you would initially choose one of these extreme values, and then measure performance of using in-between values that trade off overhead versus throughput.
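To make the two extreme values concrete, here is a minimal sketch. The class name `ThresholdDemo`, the helpers `sumValues` and `sampleMap`, and the in-between value 10_000 are made up for illustration; only `reduceValuesToLong` comes from ConcurrentHashMap itself. The result should be identical for every threshold; only the way the work is split differs. Note that the map variable has to be declared as ConcurrentHashMap, since these methods are not part of ConcurrentMap.

```java
import java.util.concurrent.ConcurrentHashMap;

class ThresholdDemo {
    // Sums all values. The threshold only controls whether the work may be
    // split into parallel subtasks; it does not change the result.
    static long sumValues(ConcurrentHashMap<String, Integer> map, long parallelismThreshold) {
        return map.reduceValuesToLong(parallelismThreshold, Integer::longValue, 0L, Long::sum);
    }

    // Hypothetical test data: keys "key0".."key(n-1)" mapped to 0..n-1.
    static ConcurrentHashMap<String, Integer> sampleMap(int n) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < n; i++) {
            map.put("key" + i, i);
        }
        return map;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = sampleMap(100_000);

        long sequential = sumValues(map, Long.MAX_VALUE); // suppresses all parallelism
        long parallel = sumValues(map, 1L);               // maximal parallelism
        long tuned = sumValues(map, 10_000L);             // an arbitrary in-between value to benchmark

        System.out.println(sequential + " " + parallel + " " + tuned);
    }
}
```

Which threshold is fastest depends on the map size and the cost of the function, so this only shows the call shape, not a recommended value.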

dkatzel answered Oct 25 '22 18:10



The parallelismThreshold determines whether a bulk operation is executed sequentially or in parallel. Running in parallel has some overhead, so it pays off only above a certain map size.
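One rough way to see the switch is to record which threads run the action. This is only a sketch: `ThreadObserver`, `threadsUsed` and `sampleMap` are hypothetical names, and the set of threads you see with a threshold of 1 depends on your machine and on how busy the common pool is.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ThreadObserver {
    // Returns the names of all threads that executed the forEach action.
    static Set<String> threadsUsed(ConcurrentHashMap<Integer, Integer> map, long parallelismThreshold) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
        map.forEach(parallelismThreshold, (k, v) -> names.add(Thread.currentThread().getName()));
        return names;
    }

    // Hypothetical test data: i -> i for 0..n-1.
    static ConcurrentHashMap<Integer, Integer> sampleMap(int n) {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(i, i);
        }
        return map;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Integer, Integer> map = sampleMap(100_000);

        // Sequential: only the calling thread runs the action.
        System.out.println("Long.MAX_VALUE -> " + threadsUsed(map, Long.MAX_VALUE));

        // Parallel allowed: may also list ForkJoinPool.commonPool worker threads.
        System.out.println("1 -> " + threadsUsed(map, 1L));
    }
}
```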

ConcurrentHashMaps support a set of sequential and parallel bulk operations that, unlike most Stream methods, are designed to be safely, and often sensibly, applied even with maps that are being concurrently updated by other threads; for example, when computing a snapshot summary of the values in a shared registry. There are three kinds of operation, each with four forms, accepting functions with Keys, Values, Entries, and (Key, Value) arguments and/or return values. Because the elements of a ConcurrentHashMap are not ordered in any particular way, and may be processed in different orders in different parallel executions, the correctness of supplied functions should not depend on any ordering, or on any other objects or values that may transiently change while computation is in progress; and except for forEach actions, should ideally be side-effect-free. Bulk operations on Map.Entry objects do not support method setValue.

- forEach: Perform a given action on each element. A variant form applies a given
    transformation on each element before performing the action.
- search: Return the first available non-null result of applying a given function
    on each element; skipping further search when a result is found.
- reduce: Accumulate each element. The supplied reduction function cannot rely on
    ordering (more formally, it should be both associative and commutative).
    There are five variants:
    - Plain reductions. (There is not a form of this method for (key, value)
        function arguments since there is no corresponding return type.)
    - Mapped reductions that accumulate the results of a given function applied
        to each element.
    - Reductions to scalar doubles, longs, and ints, using a given basis value.
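The three kinds of operation listed above could look roughly like this in practice. It is a sketch rather than anything taken from the Javadoc: the class `BulkOpsDemo`, its helper methods and the stock data are invented for illustration, while `forEach`, `search`, `reduceValues` and `reduceValuesToInt` are the actual ConcurrentHashMap methods.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class BulkOpsDemo {
    // Hypothetical sample data: item name -> units in stock.
    static ConcurrentHashMap<String, Integer> stock() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("apple", 3);
        map.put("banana", 0);
        map.put("cherry", 12);
        return map;
    }

    // forEach (variant form): transform each (key, value) pair first; the
    // action runs only on non-null transformation results.
    static Set<String> outOfStock(ConcurrentHashMap<String, Integer> map, long threshold) {
        Set<String> result = ConcurrentHashMap.newKeySet(); // safe if the action runs in parallel
        map.forEach(threshold, (k, v) -> v == 0 ? k : null, result::add);
        return result;
    }

    // search: returns a non-null result of the function, or null if none is found.
    static String firstAbove(ConcurrentHashMap<String, Integer> map, long threshold, int limit) {
        return map.search(threshold, (k, v) -> v > limit ? k : null);
    }

    // reduce to a scalar int, using 0 as the basis value.
    static int totalUnits(ConcurrentHashMap<String, Integer> map, long threshold) {
        return map.reduceValuesToInt(threshold, Integer::intValue, 0, Integer::sum);
    }

    // Plain reduction: max is associative and commutative; returns null for an empty map.
    static Integer maxUnits(ConcurrentHashMap<String, Integer> map, long threshold) {
        return map.reduceValues(threshold, Integer::max);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = stock();
        long threshold = 1L; // allow parallelism; Long.MAX_VALUE would force sequential

        System.out.println(outOfStock(map, threshold));
        System.out.println(firstAbove(map, threshold, 10));
        System.out.println(totalUnits(map, threshold));
        System.out.println(maxUnits(map, threshold));
    }
}
```

With this sample data the calls report banana as out of stock, cherry as the only item above 10, a total of 15 and a maximum of 12, regardless of the threshold.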

These bulk operations accept a parallelismThreshold argument. Methods proceed sequentially if the current map size is estimated to be less than the given threshold. Using a value of Long.MAX_VALUE suppresses all parallelism. Using a value of 1 results in maximal parallelism by partitioning into enough subtasks to fully utilize the ForkJoinPool.commonPool() that is used for all parallel computations. Normally, you would initially choose one of these extreme values, and then measure performance of using in-between values that trade off overhead versus throughput.

(Source: ConcurrentHashMap Javadoc)

Eran answered Oct 25 '22 17:10
