
Cache Performance in Hash Tables with Chaining vs Open Addressing

The following page from geeksforgeeks.org states:

Cache performance of chaining is not good as keys are stored using linked list. Open addressing provides better cache performance as everything is stored in same table.

So my questions are:

  • What causes chaining to have bad cache performance?
  • Where is the cache being used?
  • Why would open addressing provide better cache performance? I cannot see how the cache comes into it.
  • Also, what considerations would you take into account when deciding between chaining, linear-probed open addressing, and quadratic-probed open addressing?
asked Apr 07 '18 by Trajan


People also ask

Why does open addressing provide better cache performance than chaining?

Open addressing provides better cache performance because all the data is stored in the same table. It is also easy to implement, as no pointers are involved. Different collision-resolution strategies can be adopted as the use case requires.

Which hash table has the best cache performance?

Linear probing has the best cache performance but suffers from clustering. Another advantage of linear probing is that the probe sequence is easy to compute. Quadratic probing lies between the two in terms of cache performance and clustering. Double hashing has poor cache performance but no clustering.

Is open addressing or chaining faster?

Open-addressing is usually faster than chained hashing when the load factor is low because you don't have to follow pointers between list nodes.

What is the advantage of chained hash table over open addressing?

Deletion is easier, because a chained hash table stores colliding keys in a linked list, so a node can simply be unlinked.


1 Answer

Sorry, since the questions are quite broad, the answers will also be quite generic, with some links to more detailed information.

It is better to start with the question:

Where is the cache being used?

On modern CPUs, caches are used everywhere: to read program instructions and to read/write data in memory. On most CPUs the cache is transparent, i.e. there is no need to manage it explicitly.

Cache is much faster than the main memory (DRAM). To give you some perspective, accessing data in the Level 1 cache takes ~4 CPU cycles, while accessing DRAM on the same CPU takes ~200 CPU cycles, i.e. the cache is 50 times faster.

Caches operate on small blocks called cache lines, which are usually 64 bytes long.

More info: https://en.wikipedia.org/wiki/CPU_cache

What causes chaining to have a bad cache performance?

Basically, chaining is not cache friendly, and this is not specific to hash tables: "classical" linked lists have the same issue.

Hash keys (or list nodes) are far away from each other in memory, so each key access is likely to generate a "cache miss", i.e. a slow DRAM access. So checking 10 keys in a chain takes 10 DRAM accesses, i.e. 200 x 10 = 2000 cycles on our generic CPU.

The address of the next key is not known until the next pointer has been read from the current key, so there is not much room for optimization...
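
To illustrate, here is a minimal C sketch of a chained lookup (the names and the FNV-1a hash are illustrative, not from any particular library). Each node is a separate heap allocation, and every loop iteration depends on the pointer loaded in the previous one:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* FNV-1a: a simple string hash, used here only for illustration */
    static uint64_t fnv1a(const char *s)
    {
        uint64_t h = 14695981039346656037ull;
        while (*s)
            h = (h ^ (unsigned char)*s++) * 1099511628211ull;
        return h;
    }

    struct node {
        struct node *next;  /* the next node's address is unknown until this is loaded */
        char key[16];
        int value;
    };

    struct chained_table {
        struct node **buckets;
        size_t nbuckets;
    };

    int chained_lookup(const struct chained_table *t, const char *key, int *value)
    {
        struct node *n = t->buckets[fnv1a(key) % t->nbuckets];
        while (n != NULL) {
            /* each step chases a pointer to a separately allocated node,
               so each step is a potential cache miss */
            if (strcmp(n->key, key) == 0) {
                *value = n->value;
                return 1;
            }
            n = n->next;
        }
        return 0;
    }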

Why would open addressing provide better cache performance as I cannot see how the cache comes into this?

Linear probing is cache friendly. Keys are "clustered" together, so once we have accessed the first key (a slow DRAM access), the next key will most probably already be in the cache, since a cache line is 64 bytes. So accessing the same 10 keys with open addressing takes 1 DRAM access and 9 cache accesses, i.e. 200 x 1 + 9 x 4 = 236 cycles on our generic CPU. That is much faster than the 2000 cycles for chained keys.

Also, since we access memory in a predictable manner, there is room for optimizations like cache prefetching: https://en.wikipedia.org/wiki/Cache_prefetching
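
For comparison, here is the same lookup sketched with linear probing over a flat array (the empty-slot convention and sizes are assumptions of this sketch, not a reference implementation). Successive probes touch adjacent slots, so they usually land in a cache line that is already loaded:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    static uint64_t fnv1a(const char *s)  /* same illustrative hash as above */
    {
        uint64_t h = 14695981039346656037ull;
        while (*s)
            h = (h ^ (unsigned char)*s++) * 1099511628211ull;
        return h;
    }

    struct entry {
        char key[16];  /* key[0] == '\0' marks a free slot in this sketch */
        int value;
    };

    struct oa_table {
        struct entry *entries;
        size_t nentries;  /* a power of two, so "& mask" replaces "% nentries" */
    };

    int oa_lookup(const struct oa_table *t, const char *key, int *value)
    {
        size_t mask = t->nentries - 1;
        size_t i = (size_t)fnv1a(key) & mask;

        while (t->entries[i].key[0] != '\0') {  /* an empty slot ends the probe */
            if (strcmp(t->entries[i].key, key) == 0) {
                *value = t->entries[i].value;
                return 1;
            }
            i = (i + 1) & mask;  /* the next probe is the adjacent array slot */
        }
        return 0;
    }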

Also, what considerations would you take into account when deciding between chaining, linear-probed open addressing, and quadratic-probed open addressing?

Having to resolve many collisions, whether by chaining or by linear probing, is not a good sign anyway. So the first thing I would consider is making sure the probability of collisions is at a minimum, by using a good hash function and a reasonable hash table size.
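
In code, "reasonable hash size" usually boils down to a load factor check; a common rule of thumb (the exact 0.7 threshold is my assumption, not from the answer) looks like this:

    #include <stddef.h>

    /* grow (e.g. double) the table once used/capacity exceeds ~0.7,
       before chains or probe sequences get long */
    int should_grow(size_t used, size_t capacity)
    {
        return used * 10 >= capacity * 7;
    }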

The second thing I would consider is a ready-to-use solution. Sure, there still might be some rare cases when you need your own implementation...

Not sure which language you use, but here is a blazingly fast hash table implementation with a BSD license: http://dpdk.org/browse/dpdk/tree/lib/librte_hash/rte_cuckoo_hash.h

So, if you still need your own hash table implementation and you do care about performance, the next fairly easy thing to implement would be cache-aligned buckets instead of plain hash elements. It wastes a few bytes per element (i.e. each hash table element becomes 64 bytes long), but in case of a collision there is fast storage for at least a few keys. The code to manage those buckets is also a bit more complicated, so consider whether the extra effort is worth it for you...
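
Here is a rough sketch of such a cache-aligned bucket (field names and sizes are illustrative; real implementations such as the DPDK table linked above are more elaborate, storing short hash "signatures" per slot with the full keys kept elsewhere):

    #include <stdint.h>
    #include <stdalign.h>

    #define CACHE_LINE 64
    #define SLOTS_PER_BUCKET 4

    struct bucket {
        /* one bucket occupies exactly one cache line: 32 bytes of fields,
           rounded up to 64 bytes by the alignment */
        alignas(CACHE_LINE) uint32_t sigs[SLOTS_PER_BUCKET];  /* 0 = empty slot */
        uint32_t values[SLOTS_PER_BUCKET];
    };

    int bucket_find(const struct bucket *b, uint32_t sig, uint32_t *value)
    {
        /* the whole bucket is in one cache line, so scanning every slot
           after a collision costs at most one DRAM access */
        for (int i = 0; i < SLOTS_PER_BUCKET; i++) {
            if (b->sigs[i] == sig) {
                *value = b->values[i];
                return 1;
            }
        }
        return 0;
    }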

answered Sep 30 '22 by Andriy Berestovskyy