For a given hash value, the indices generated by linear probing are as follows:
h, h+1, h+2, h+3, etc.
For a given hash value, the indices generated by quadratic probing are as follows:
h, h+1, h+4, h+9, etc.
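The two sequences above can be sketched in a few lines of Python. This is only an illustration: the function names, the table size of 11, and the home slot 3 are assumptions chosen for the example, and each index is reduced modulo the table size so it stays inside the array.

```python
def linear_probes(h, table_size, count):
    """First `count` slots visited by linear probing from home slot h."""
    return [(h + i) % table_size for i in range(count)]


def quadratic_probes(h, table_size, count):
    """First `count` slots visited by quadratic probing (offsets i*i)."""
    return [(h + i * i) % table_size for i in range(count)]


# Hypothetical example: home slot 3 in a table of 11 slots.
print(linear_probes(3, 11, 5))     # [3, 4, 5, 6, 7]
print(quadratic_probes(3, 11, 5))  # [3, 4, 7, 1, 8]
```

Linear probing stays in a contiguous run of slots, while quadratic probing spreads out after the first step.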
Clusters will form with linear probing but not with quadratic probing.
But how can quadratic probing be more efficient than linear probing when both methods seem to require the same number of steps for insertion or searching? Thanks!
Quadratic probing can be a more efficient algorithm in an open addressing table, since it better avoids the clustering problem that can occur with linear probing, although it is not immune.
Quadratic probing tends to be more efficient than linear probing if the number of items to be inserted is no more than half the size of the array, because it eliminates the primary clustering problem. In the best case, each technique works in O(1), but this is only achieved when there are no collisions.
Quadratic probing has secondary clustering. This occurs when two keys hash to the same location: they then follow the same probe sequence, so it may take many attempts before an insertion succeeds. Also, a probe sequence does not necessarily visit every location in the table.
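Both points can be checked with a small sketch. It assumes integer keys, a hypothetical hash of `key % table_size`, and the probe rule `h + i*i`; none of these details come from the answer itself.

```python
def quadratic_sequence(key, table_size, count):
    """First `count` slots probed for an integer key (hash = key % table_size)."""
    h = key % table_size
    return [(h + i * i) % table_size for i in range(count)]


def reachable_slots(h, table_size):
    """Distinct slots that quadratic probing can ever visit from home slot h."""
    return sorted({(h + i * i) % table_size for i in range(table_size)})


# Secondary clustering: 14 and 25 both hash to slot 3 in a table of 11,
# so they follow exactly the same probe sequence.
print(quadratic_sequence(14, 11, 5))  # [3, 4, 7, 1, 8]
print(quadratic_sequence(25, 11, 5))  # [3, 4, 7, 1, 8]

# Incomplete coverage: only some slots are ever probed.
print(reachable_slots(0, 11))  # [0, 1, 3, 4, 5, 9]
print(reachable_slots(0, 16))  # [0, 1, 4, 9]
```

With the prime size 11, the sequence reaches 6 of the 11 slots, which is why the usual advice is to keep the table no more than half full. With size 16 and this particular probe rule, it reaches only 4.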
Double hashing is generally the most efficient of these collision-resolution techniques when the size of the table is a prime number, and it avoids both kinds of clustering. Quadratic probing is also efficient, but only when the number of records to be stored is no more than half the size of the table.
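A minimal sketch of double hashing, for comparison. The two hash functions here (`key % table_size` for the home slot and `1 + key % (table_size - 1)` for the step) are one common textbook choice, used as an assumption for this example rather than as the only option.

```python
def double_hash_probes(key, table_size, count):
    """Probe sequence h1 + i*h2 (mod table_size) for an integer key."""
    h1 = key % table_size
    # The step must be nonzero. With a prime table size, every step in
    # 1..table_size-1 is coprime to the size, so all slots get visited.
    h2 = 1 + (key % (table_size - 1))
    return [(h1 + i * h2) % table_size for i in range(count)]


# 14 and 25 share home slot 3 in a table of 11 but use different steps,
# so their probe sequences diverge after the first slot.
print(double_hash_probes(14, 11, 5))  # [3, 8, 2, 7, 1]
print(double_hash_probes(25, 11, 5))  # [3, 9, 4, 10, 5]
```

Because the step depends on the key, two keys that collide at the home slot usually part ways on the next probe, which is how double hashing avoids secondary clustering.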
The efficiency depends on the kinds of clustering formed by linear probing and quadratic probing.
Linear probing forms primary clustering: once a cluster has formed, the bigger it gets, the faster it grows. This reduces performance severely. Robert Lafore has given a nice example: it's like the crowd that gathers when someone faints at the shopping mall. The first arrivals come because they saw the victim fall; later arrivals gather because they wondered what everyone else was looking at. The larger the crowd grows, the more people are attracted to it.
Quadratic probing, by contrast, forms secondary clustering. It is an attempt to keep clusters from forming: the idea is to probe more widely separated cells instead of those adjacent to the primary hash site. Following the analogy, it tries to keep the first arrivals from forming a crowd. Secondary clustering is more subtle and less severe in terms of performance than primary clustering.
A search stops when it hits an empty slot, because an empty slot means the value you are looking for is not in the hash table. With reduced clustering, a search is more likely to hit an empty slot early and stop. Likewise, an insertion is more likely to find an empty slot quickly, which in turn makes later searches for that value faster.
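The effect can be seen on a small hand-built case. The sketch below assumes integer keys hashed with `key % table_size`, a table of 11 slots, and no deletions; the keys 0, 1, 2, 3 are chosen deliberately to form a cluster, so this illustrates the mechanism and is not a benchmark.

```python
def offset(kind, i):
    """Probe offset for attempt i: i for linear, i*i for quadratic."""
    return i if kind == "linear" else i * i


def insert(table, key, kind):
    """Insert key and return the number of slots probed."""
    m = len(table)
    for i in range(m):
        slot = (key + offset(kind, i)) % m
        if table[slot] is None:
            table[slot] = key
            return i + 1
    raise RuntimeError("no free slot found")


def search_cost(table, key, kind):
    """Slots probed until key is found or an empty slot ends the search."""
    m = len(table)
    for i in range(m):
        slot = (key + offset(kind, i)) % m
        if table[slot] is None or table[slot] == key:
            return i + 1
    return m


def run(kind):
    table = [None] * 11
    # 0..3 occupy adjacent slots; 11 also hashes to slot 0.
    total = sum(insert(table, k, kind) for k in [0, 1, 2, 3, 11])
    # 22 hashes to slot 0 as well but is not in the table.
    return total, search_cost(table, 22, kind)


print(run("linear"))     # (9, 6)
print(run("quadratic"))  # (7, 4)
```

With linear probing, key 11 has to walk through the whole cluster and the failed search for 22 does the same. With quadratic probing, both jump out of the cluster sooner and reach an empty slot in fewer probes.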