
Cache Performance in Hash Tables with Chaining vs Open Addressing

The following page from geeksforgeeks.org states:

Cache performance of chaining is not good as keys are stored using linked list. Open addressing provides better cache performance as everything is stored in same table.

So my questions are:

  • What causes chaining to have bad cache performance?
  • Where is the cache being used?
  • Why would open addressing provide better cache performance? I cannot see how the cache comes into it.
  • Also, what considerations would you take into account when deciding between chaining, linear-probed open addressing, and quadratic-probed open addressing?
asked Apr 07 '18 by Trajan


People also ask

Why does open addressing provide better cache performance than chaining?

Open addressing provides better cache performance because all the data is stored in the same table. It is also easy to implement, as no pointers are involved. Different collision-resolution strategies can be adopted as the use case requires.

Which hash table has the best cache performance?

Linear probing has the best cache performance but suffers from clustering. Another advantage of linear probing is that the probe sequence is easy to compute. Quadratic probing lies between the two in terms of cache performance and clustering. Double hashing has poor cache performance but no clustering.

Is open addressing or chaining faster?

Open-addressing is usually faster than chained hashing when the load factor is low because you don't have to follow pointers between list nodes.

What is the advantage of chained hash table over open addressing?

Deletion is easier, because a chained hash table stores colliding keys in a linked list, so a node can simply be unlinked.


1 Answer

Sorry, since the questions are quite broad, the answers will also be quite generic, with some links to more detailed information.

It is better to start with the question:

Where is the cache being used?

On modern CPUs, caches are used everywhere: to read program instructions and to read/write data in memory. On most CPUs the cache is transparent, i.e. there is no need to manage it explicitly.

Cache is much faster than the main memory (DRAM). To give you some perspective, accessing data in the Level 1 cache takes ~4 CPU cycles, while accessing DRAM on the same CPU takes ~200 CPU cycles, i.e. the cache is 50 times faster.

Caches operate on small blocks called cache lines, which are usually 64 bytes long.

More info: https://en.wikipedia.org/wiki/CPU_cache

What causes chaining to have a bad cache performance?

Basically, chaining is not cache friendly, and this is not specific to hash tables: "classical" linked lists have the same issue.

Hash keys (or list nodes) are far away from each other in memory, so each key access is likely to generate a "cache miss", i.e. a slow DRAM access. So checking 10 keys in a chain takes 10 DRAM accesses, i.e. 200 x 10 = 2000 cycles on our generic CPU.

The address of the next key is not known until the next pointer has been read from the current key, so there is not much room for optimization...
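
To illustrate, here is a minimal C sketch of a chained lookup (the names and the FNV-1a hash are illustrative, not from any particular library). Each node is a separate heap allocation, and every loop iteration depends on the pointer loaded in the previous one:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* FNV-1a: a simple string hash, used here only for illustration */
    static uint64_t fnv1a(const char *s)
    {
        uint64_t h = 14695981039346656037ull;
        while (*s)
            h = (h ^ (unsigned char)*s++) * 1099511628211ull;
        return h;
    }

    struct node {
        struct node *next;  /* the next node's address is unknown until this is loaded */
        char key[16];
        int value;
    };

    struct chained_table {
        struct node **buckets;
        size_t nbuckets;
    };

    int chained_lookup(const struct chained_table *t, const char *key, int *value)
    {
        struct node *n = t->buckets[fnv1a(key) % t->nbuckets];
        while (n != NULL) {
            /* each step chases a pointer to a separately allocated node,
               so each step is a potential cache miss */
            if (strcmp(n->key, key) == 0) {
                *value = n->value;
                return 1;
            }
            n = n->next;
        }
        return 0;
    }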

Why would open addressing provide better cache performance as I cannot see how the cache comes into this?

Linear probing is cache friendly. Keys are "clustered" together, so once we have accessed the first key (a slow DRAM access), the next key will most probably already be in the cache, since a cache line is 64 bytes. So accessing the same 10 keys with open addressing takes 1 DRAM access and 9 cache accesses, i.e. 200 x 1 + 9 x 4 = 236 cycles on our generic CPU. That is much faster than the 2000 cycles for chained keys.

Also, since we access memory in a predictable manner, there is room for optimizations like cache prefetching: https://en.wikipedia.org/wiki/Cache_prefetching
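
For comparison, here is the same lookup sketched with linear probing over a flat array (the empty-slot convention and sizes are assumptions of this sketch, not a reference implementation). Successive probes touch adjacent slots, so they usually land in a cache line that is already loaded:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    static uint64_t fnv1a(const char *s)  /* same illustrative hash as above */
    {
        uint64_t h = 14695981039346656037ull;
        while (*s)
            h = (h ^ (unsigned char)*s++) * 1099511628211ull;
        return h;
    }

    struct entry {
        char key[16];  /* key[0] == '\0' marks a free slot in this sketch */
        int value;
    };

    struct oa_table {
        struct entry *entries;
        size_t nentries;  /* a power of two, so "& mask" replaces "% nentries" */
    };

    int oa_lookup(const struct oa_table *t, const char *key, int *value)
    {
        size_t mask = t->nentries - 1;
        size_t i = (size_t)fnv1a(key) & mask;

        while (t->entries[i].key[0] != '\0') {  /* an empty slot ends the probe */
            if (strcmp(t->entries[i].key, key) == 0) {
                *value = t->entries[i].value;
                return 1;
            }
            i = (i + 1) & mask;  /* the next probe is the adjacent array slot */
        }
        return 0;
    }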

Also, what considerations would you take into account when deciding between chaining, linear-probed open addressing, and quadratic-probed open addressing?

Having to resolve many collisions, whether by chaining or by linear probing, is not a good sign anyway. So the first thing I would consider is making sure the probability of collisions is at a minimum, by using a good hash function and a reasonable hash table size.
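
In code, "reasonable hash size" usually boils down to a load factor check; a common rule of thumb (the exact 0.7 threshold is my assumption, not from the answer) looks like this:

    #include <stddef.h>

    /* grow (e.g. double) the table once used/capacity exceeds ~0.7,
       before chains or probe sequences get long */
    int should_grow(size_t used, size_t capacity)
    {
        return used * 10 >= capacity * 7;
    }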

The second thing I would consider is a ready-to-use solution. Sure, there still might be some rare cases when you need your own implementation...

Not sure which language you use, but here is a blazingly fast hash table implementation with a BSD license: http://dpdk.org/browse/dpdk/tree/lib/librte_hash/rte_cuckoo_hash.h

So, if you still need your own hash table implementation and you do care about performance, the next fairly easy thing to implement would be cache-aligned buckets instead of plain hash elements. It wastes a few bytes per element (i.e. each hash table element becomes 64 bytes long), but in case of a collision there is fast storage for at least a few keys. The code to manage those buckets is also a bit more complicated, so consider whether the extra effort is worth it for you...
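
Here is a rough sketch of such a cache-aligned bucket (field names and sizes are illustrative; real implementations such as the DPDK table linked above are more elaborate, storing short hash "signatures" per slot with the full keys kept elsewhere):

    #include <stdint.h>
    #include <stdalign.h>

    #define CACHE_LINE 64
    #define SLOTS_PER_BUCKET 4

    struct bucket {
        /* one bucket occupies exactly one cache line: 32 bytes of fields,
           rounded up to 64 bytes by the alignment */
        alignas(CACHE_LINE) uint32_t sigs[SLOTS_PER_BUCKET];  /* 0 = empty slot */
        uint32_t values[SLOTS_PER_BUCKET];
    };

    int bucket_find(const struct bucket *b, uint32_t sig, uint32_t *value)
    {
        /* the whole bucket is in one cache line, so scanning every slot
           after a collision costs at most one DRAM access */
        for (int i = 0; i < SLOTS_PER_BUCKET; i++) {
            if (b->sigs[i] == sig) {
                *value = b->values[i];
                return 1;
            }
        }
        return 0;
    }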

answered Sep 30 '22 by Andriy Berestovskyy