Is there some optimal value for ConcurrencyLevel beyond which ConcurrentHashMap's performance starts degrading?
If yes, what's that value, and what's the reason for the performance degradation? (This question originates from trying to find out what practical limitations a ConcurrentHashMap may have.)
The Javadoc offers pretty detailed guidance:
"The allowed concurrency among update operations is guided by the optional concurrencyLevel constructor argument (default 16), which is used as a hint for internal sizing. The table is internally partitioned to try to permit the indicated number of concurrent updates without contention. Because placement in hash tables is essentially random, the actual concurrency will vary. Ideally, you should choose a value to accommodate as many threads as will ever concurrently modify the table. Using a significantly higher value than you need can waste space and time, and a significantly lower value can lead to thread contention. But overestimates and underestimates within an order of magnitude do not usually have much noticeable impact. A value of one is appropriate when it is known that only one thread will modify and all others will only read."
To summarize: the optimal value depends on the number of expected concurrent updates. A value within an order of magnitude of that should work well. Values outside that range can be expected to lead to performance degradation.
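As a minimal sketch of where this argument goes, the three-argument constructor `ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel)` takes the hint directly. The class name, the helper `newMap`, and the capacity and load-factor values below are illustrative assumptions, not recommendations from the Javadoc.

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrencyLevelExample {
    // Hypothetical helper: builds a map hinted for the given number of
    // threads expected to update it concurrently.
    public static ConcurrentHashMap<String, Integer> newMap(int expectedWriters) {
        int initialCapacity = 1024; // assumed starting size, chosen for illustration
        float loadFactor = 0.75f;   // assumed, a commonly used load factor
        // The third argument is the concurrencyLevel hint described above.
        // It must be positive, otherwise the constructor throws IllegalArgumentException.
        return new ConcurrentHashMap<>(initialCapacity, loadFactor, expectedWriters);
    }

    public static void main(String[] args) {
        // Assume at most 8 threads will ever modify the map at the same time.
        ConcurrentHashMap<String, Integer> map = newMap(8);
        map.put("a", 1);
        map.merge("a", 1, Integer::sum); // atomic read-modify-write
        System.out.println(map.get("a")); // prints 2
    }
}
```

If only one thread ever writes and the rest only read, passing 1 as the last argument matches the Javadoc's advice.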
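You have to ask yourself two questions: how many CPUs do you have, and what fraction of the time will a useful program spend accessing the map?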
The first question tells you the maximum number of threads which can access the map at once. You can have 10000 threads, but if you have only 4 CPUs, at most 4 will be running at once.
The second question tells you how much of the time, at most, any of those threads will be accessing the map AND doing something useful. You can optimise the map to do something useless (e.g. a micro-benchmark) but there is no point tuning for this IMHO. Say you have a useful program which uses the map a lot. It might still be spending 90% of the time doing something else, e.g. IO, accessing other maps, building keys or values, or doing something with the values it gets from the map.
Say you spend 10% of the time accessing a map on a machine with 4 CPUs. This means that, on average, 0.4 threads will be accessing the map at any moment (or one thread about 40% of the time). In this case a concurrency level of 1-4 is fine.
In any case, making the concurrency level higher than the number of CPUs you have is likely to be unnecessary, even for a micro-benchmark.
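The arithmetic above can be sketched in a few lines. This is only a rough estimate under the stated assumptions: the class and method names are hypothetical, the 10% figure is the example value from this answer, and rounding up to a whole number is my own choice rather than anything the Javadoc prescribes.

```java
class ConcurrencyEstimate {
    // Average number of threads inside the map at any moment:
    // CPUs available times the fraction of time spent accessing the map.
    public static double expectedConcurrentAccessors(int cpus, double fractionOfTimeInMap) {
        return cpus * fractionOfTimeInMap;
    }

    // Hypothetical rule of thumb: round the estimate up, use at least 1,
    // and never exceed the number of CPUs.
    public static int suggestedConcurrencyLevel(int cpus, double fractionOfTimeInMap) {
        int roundedUp = (int) Math.ceil(expectedConcurrentAccessors(cpus, fractionOfTimeInMap));
        return Math.max(1, Math.min(cpus, roundedUp));
    }

    public static void main(String[] args) {
        // The example from the text: 4 CPUs, 10% of the time spent in the map.
        System.out.println(suggestedConcurrencyLevel(4, 0.10)); // prints 1

        // The same estimate for the machine this runs on, still assuming 10%.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println(suggestedConcurrencyLevel(cpus, 0.10));
    }
}
```

Since estimates within an order of magnitude usually behave similarly, the result is a starting point rather than a value that needs precise tuning.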