I am wondering about the parameters for constructing a ConcurrentHashMap:

initialCapacity is 16 by default (understood).
loadFactor is 0.75 by default.
concurrencyLevel is 16 by default.

My questions are:

What criteria should I use to adjust loadFactor up or down?
What criteria should I use to adjust concurrencyLevel up or down?

Additionally: what are the hallmarks of a good hashcode implementation?

Thank you!
ConcurrentHashMap() Creates a new, empty map with a default initial capacity (16), load factor (0.75) and concurrencyLevel (16).
A ConcurrentHashMap has an internal final class called Segment (in implementations prior to Java 8), so we can say that a ConcurrentHashMap is internally divided into segments. By default there are 16 of them, matching the default concurrencyLevel, so at most 16 threads can write at a time.
A good approach can be to initialize it like this: ConcurrentHashMap<String, Integer> instance = new ConcurrentHashMap<String, Integer>(16, 0.9f, 1); An initial capacity of 16 allows a reasonable number of elements to be added before any resizing happens.
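Here is a minimal runnable sketch of that initialization (the class name and the single put/get are just illustrative):

    import java.util.concurrent.ConcurrentHashMap;

    public class MapInit {
        public static void main(String[] args) {
            // capacity 16, load factor 0.9f, concurrency level 1:
            // a concurrency level of 1 suits a single expected writer
            ConcurrentHashMap<String, Integer> instance =
                    new ConcurrentHashMap<String, Integer>(16, 0.9f, 1);
            instance.put("answer", 42);
            System.out.println(instance.get("answer")); // prints 42
        }
    }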
ConcurrentHashMap is thread-safe, so multiple threads can operate on a single instance without any external synchronization. Internally, the map is divided into a number of segments according to the concurrency level, and by default it allows 16 threads to update the map concurrently (reads generally do not block at all).
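To illustrate that thread safety concretely (assuming Java 8+ for merge and lambdas; the class and key names here are hypothetical), two threads can increment the same counter with no external locking and no lost updates:

    import java.util.concurrent.ConcurrentHashMap;

    public class ConcurrentUpdates {
        public static void main(String[] args) throws InterruptedException {
            ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
            Runnable task = () -> {
                for (int i = 0; i < 1000; i++) {
                    // merge is atomic, so concurrent increments are never lost
                    counts.merge("hits", 1, Integer::sum);
                }
            };
            Thread t1 = new Thread(task);
            Thread t2 = new Thread(task);
            t1.start();
            t2.start();
            t1.join();
            t2.join();
            System.out.println(counts.get("hits")); // always prints 2000
        }
    }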
The short answer: set "initial capacity" to roughly how many mappings you expect to put in the map, and leave the other parameters at their defaults.
Long answer:
load factor is the ratio between the number of elements and the number of "buckets" at which the map resizes: with a load factor of 0.75, the map grows once it is three-quarters full;
0.75 is usually a reasonable compromise; as I recall, it means that with a good hash function, on average we expect about 1.6 redirects to find an element in the map (or around that figure);
changing the load factor shifts the compromise between more redirects to find an element and less wasted space; but 0.75 is really usually a good value;
in principle, set concurrencyLevel to the number of concurrent threads you expect to have modifying the map, although overestimating this doesn't appear to have a bad effect other than wasting memory (I wrote a little on ConcurrentHashMap performance a while ago, in case you're interested).
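Putting those points together, a minimal sketch of the short answer (the expected size and writer-thread count below are hypothetical placeholders for your own estimates):

    import java.util.concurrent.ConcurrentHashMap;

    public class SizingSketch {
        public static void main(String[] args) {
            int expectedMappings = 10_000; // hypothetical expected map size
            int writerThreads = 8;         // hypothetical concurrent writers
            ConcurrentHashMap<String, Long> map =
                    new ConcurrentHashMap<String, Long>(
                            expectedMappings, // initial capacity: roughly the expected size
                            0.75f,            // load factor: keep the default
                            writerThreads);   // concurrency level: expected writers
            map.put("example", 1L);
            System.out.println(map.size()); // prints 1
        }
    }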
Informally, your hash function should essentially aim to have as much "randomness" in the bits as possible. Or more strictly, the hash code for a given element should give each bit a roughly 50% chance of being set. It's actually easier to illustrate this with an example: again, you may be interested in some stuff I wrote about how the String hash function works and associated hash function guidelines. Feedback is obviously welcome on any of this stuff.
One thing I also mention at some point is that you don't have to be too paranoid in practice: if your hash function produces a "reasonable" amount of randomness in some of the bits, then it will often be OK. In the worst case, sticking representative pieces of data into a string and taking the hash code of the string actually doesn't work so badly.
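For instance, a sketch of that string-based fallback, using a hypothetical Point class (not from the original answer):

    // Fallback approach: concatenate representative fields into a String
    // and reuse String.hashCode(), which mixes bits reasonably well.
    public class Point {
        private final int x;
        private final int y;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            // the separator avoids collisions like (1, 23) vs (12, 3)
            return (x + "," + y).hashCode();
        }
    }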
Load factor is primarily related to the quality of the hash function. The closer to zero the load factor, the less likely there are to be collisions, even if the hash function isn't so great. The trade-off is that the memory footprint is larger. In other words, the HashMap isn't distributing the entries in separate buckets for each separate hashcode; it groups them by proximity, so the more buckets it has, the more spread out the distribution, and the less likely it is that there are collisions.
So the bottom line is you fiddle with load factor to improve lookup time or reduce memory, according to your needs and the objects you are storing in the Map.
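To make that trade-off concrete, a small sketch (the capacity and load factors are arbitrary) of how the load factor sets the resize threshold; the lower it is, the more buckets the map keeps per stored entry:

    public class LoadFactorSketch {
        public static void main(String[] args) {
            int capacity = 1024; // arbitrary bucket count
            for (float f : new float[] {0.5f, 0.75f, 1.0f}) {
                // resizing is triggered once size exceeds capacity * loadFactor
                System.out.printf("load factor %.2f -> resize after ~%d entries%n",
                        f, (int) (capacity * f));
            }
        }
    }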
ConcurrencyLevel really depends on your application. If you only have two or three threads running in the application, there you go. If you are an application server with an arbitrary number of threads, then you need to understand what your load capacity is and what point you want to optimize for.
A good quality hashcode implementation provides as wide a distribution across potential values of the object as possible with the least number of collisions, while honoring the contract. In other words, it allows the HashMap (or Set as the case may be) to distribute the objects into separate buckets making lookups faster.
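As a sketch of what such an implementation typically looks like (the Person class is hypothetical; Objects.hash requires Java 7+):

    import java.util.Objects;

    public class Person {
        private final String name;
        private final int age;

        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person p = (Person) o;
            return age == p.age && Objects.equals(name, p.name);
        }

        @Override
        public int hashCode() {
            // combines the same fields used by equals, honoring the contract:
            // equal objects always produce equal hash codes
            return Objects.hash(name, age);
        }
    }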