Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance for HashMap when Key is Guaranteed Unique

If the keys I wish to use are guaranteed to be unique (or at least the assumption can be made that the keys are unique), does using a 'vanilla' ConcurrentHashMap provide the best performance, or does a hashing function or put method need to be modified to avoid needless hashing?

Also, does a numeric key have any performance benefit over a non-numeric key (such as a String or POJO with a proper hashing function)?

like image 968
D. Moore Avatar asked Dec 28 '22 17:12

D. Moore


2 Answers

As already mentioned in the comments, if you don't need the thread-safe aspects then don't use ConcurrentHashMap.

If you want the absolute best performance consider interning your keys and using an IdentityHashMap. This avoids calculating the hash of the object (and, as mentioned in the comments, negates the need for equals to be evaluated) and instead assumes that the reference itself is the hash.

Note obviously that you've got to make sure that two instances of the same key are the same object (e.g. you have to ensure reference equality, not just object equality). Interning all your keys is one approach for achieving this.

Implementation note: This is a simple linear-probe hash table, as described for example in texts by Sedgewick and Knuth. The array alternates holding keys and values. (This has better locality for large tables than does using separate arrays.) For many JRE implementations and operation mixes, this class will yield better performance than HashMap (which uses chaining rather than linear-probing).

If you know all the keys, perhaps you could also consider perfect hashing? Or map to a simple array structure?

like image 94
Jeff Foster Avatar answered Jan 11 '23 19:01

Jeff Foster


ConcurrentHashMap is the most expensive of the HashMap implementations, this is becuase it is thread safe.

All Maps must have unique keys so this is a given.

Using numbers has a performance advantage if you use a collection which supports primtives like TLongHashMap, however you may be able to go much faster using a custom hash map.

From http://vanillajava.blogspot.com/2011/07/low-gc-in-java-using-primitives.html

Test                                    Performance Memory used
Use Integer wrappers and HashMap        71 - 134 (ns)   53 MB/sec
Use int primitives and HashMap          45 - 76 (ns)    36 MB/sec
Use int primitives and FastMap          58 - 93 (ns)    28 MB/sec
Use int primitives and TIntIntHashMap   18 - 28 (ns)    nonimal
Use int primitives and simple hash map   6 - 9 (ns)     nonimal 
like image 38
Peter Lawrey Avatar answered Jan 11 '23 18:01

Peter Lawrey