Why do we use linear probing in hash tables when there is separate chaining with linked lists?

Tags: performance, algorithm, time-complexity, hashtable, hash

I recently learned about different methods for handling collisions in hash tables and saw that separate chaining with linked lists is always more time-efficient than linear probing. As for space efficiency, linear probing allocates a fixed block of memory up front, some of which we might never use, whereas separate chaining allocates memory dynamically as it is needed.

Is separate chaining with linked lists more efficient than linear probing? If so, why do we use linear probing at all?

585

asked May 23 '14 05:05

Adilli Adil

1 Answer

I'm surprised that you saw chained hashing to be faster than linear probing - in practice, linear probing is typically significantly faster than chaining. This is primarily due to locality of reference, since the accesses performed in linear probing tend to be closer in memory than the accesses performed in chained hashing.

There are other wins in linear probing. For example, insertions into a linear probing hash table don't require any new allocations (unless you're rehashing the table), so in applications like network routers where memory is scarce, it's nice to know that once the table is set up, the elements can be placed into it with no risk of a malloc failure.
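To make the "no new allocations" point concrete, here's a minimal sketch of a fixed-capacity linear probing table. The class name `LinearProbingTable` and its methods are made up for illustration, and Python doesn't expose memory layout the way C does, so read the preallocated lists as a stand-in for a single array reserved once at setup time. Deletion is left out to keep it short.

```python
class LinearProbingTable:
    """Fixed-capacity open-addressing table (hypothetical, for illustration)."""

    _EMPTY = object()  # sentinel marking an unused slot

    def __init__(self, capacity=8):
        # All storage is reserved here, once; insert() never grows it.
        self._keys = [self._EMPTY] * capacity
        self._values = [None] * capacity

    def _probe(self, key):
        # Visit slots starting at the key's home slot, wrapping around once.
        capacity = len(self._keys)
        start = hash(key) % capacity
        for step in range(capacity):
            yield (start + step) % capacity

    def insert(self, key, value):
        for i in self._probe(key):
            if self._keys[i] is self._EMPTY:
                # First free slot after the home slot: claim it.
                self._keys[i] = key
                self._values[i] = value
                return
            if self._keys[i] == key:
                # Key already present: overwrite its value.
                self._values[i] = value
                return
        raise OverflowError("table is full")

    def get(self, key, default=None):
        for i in self._probe(key):
            if self._keys[i] is self._EMPTY:
                return default  # hit a gap, so the key cannot be further on
            if self._keys[i] == key:
                return self._values[i]
        return default
```

With a capacity of 8, the integer keys 1 and 9 share home slot 1, so the second one lands in the adjacent slot 2. That adjacency is where the locality advantage comes from.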

One weakness of linear probing is that, with a bad choice of hash function, primary clustering can cause the performance of the table to degrade significantly. While chained hashing can still suffer from bad hash functions, it's less sensitive to elements with nearby hash codes, which don't adversely impact the runtime. Theoretically, linear probing only gives expected O(1) lookups if the hash functions are 5-independent or if there's sufficient entropy in the keys. There are many ways to address this, such as using the Robin Hood hashing technique or hopscotch hashing, both of which have significantly better worst-case behavior than vanilla linear probing.
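For a feel of how Robin Hood hashing evens out probe lengths, here's a rough sketch of its insertion step as it's commonly described: while probing, if the element sitting in a slot is closer to its home slot than the element being inserted, the two swap and the displaced element keeps probing. The function name `robin_hood_insert` and the `(key, value, distance)` slot layout are my own choices for illustration, not any particular library's API, and this sketch insists on at least one free slot.

```python
def robin_hood_insert(slots, key, value):
    """Insert into `slots`, a list whose entries are None or
    (key, value, distance_from_home_slot) tuples."""
    if None not in slots:
        raise OverflowError("table is full")  # sketch needs one free slot

    capacity = len(slots)
    index = hash(key) % capacity
    distance = 0  # how far the entry being carried is from its home slot

    while True:
        occupant = slots[index]
        if occupant is None:
            slots[index] = (key, value, distance)
            return
        occ_key, occ_value, occ_distance = occupant
        if occ_key == key:
            # Key already present: overwrite the value in place.
            slots[index] = (key, value, occ_distance)
            return
        if occ_distance < distance:
            # The occupant is closer to home than the carried entry:
            # take its slot and carry the occupant onward instead.
            slots[index] = (key, value, distance)
            key, value, distance = occ_key, occ_value, occ_distance
        index = (index + 1) % capacity
        distance += 1
```

For example, inserting the integer keys 0, 1, and 8 (in that order) into 8 slots leaves 8 in slot 1 and pushes 1 to slot 2, so both end up one step from home. Plain linear probing would have left 1 at home and put 8 two steps away.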

The other weakness of linear probing is that its performance degrades significantly as the load factor approaches 1. You can address this either by rehashing into a larger table before the load factor gets too high or by using the Robin Hood hashing technique described above.
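To put rough numbers on that, Knuth's classic analysis of linear probing (which assumes an idealized, uniformly random hash function) gives about ½(1 + 1/(1 − α)) probes for a successful lookup and ½(1 + 1/(1 − α)²) for an unsuccessful lookup or an insertion, where α is the load factor. A small helper, with the made-up name `expected_probes`, evaluates both:

```python
def expected_probes(load_factor):
    """Approximate expected probe counts for linear probing at a given
    load factor, per Knuth's analysis under idealized random hashing.

    Returns (successful_lookup, unsuccessful_lookup_or_insert).
    """
    if not 0 <= load_factor < 1:
        raise ValueError("load factor must be in [0, 1)")
    gap = 1 - load_factor  # fraction of the table that is still empty
    hit = 0.5 * (1 + 1 / gap)
    miss = 0.5 * (1 + 1 / gap ** 2)
    return hit, miss


if __name__ == "__main__":
    for alpha in (0.5, 0.75, 0.9):
        hit, miss = expected_probes(alpha)
        print(f"load {alpha}: hit ~{hit:.1f} probes, miss ~{miss:.1f} probes")
```

At a load factor of 0.5 this works out to 1.5 and 2.5 probes; at 0.9 it is roughly 5.5 and 50.5, which is why tables are usually resized well before they fill up.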

Hope this helps!

195

answered Sep 23 '22 14:09

templatetypedef

