So I'm reading up about hash tables, hash functions etc. I was intrigued to read on Wikipedia about how "dynamic perfect hashing" involves using a second hash table as the data structure to store multiple values within a particular bucket.
Where I get lost, however, is how a universal hash function is selected to perform the hashing for that second hash table. Can anyone explain how this universal hash function is determined from the values being stored in the bucket? I can vaguely follow the reasoning and logic in Wikipedia's "universal hash function" page, but I'm struggling to get any intuition for it. In particular, how do these functions guarantee no clashes? Or at least, if a function is discarded and a new one generated whenever a clash is detected, how do we know this can be done in a realistic amount of time, if at all?
Ladybird book explanation please?
Perfect hashing means that read access takes constant time even in the worst case.
For inserting keys there are no worst-case guarantees; the time bounds hold only on average (or maybe amortized).
To make insertion fast enough, the second-level hash table is made very large relative to the number of keys k in its bucket (k^2 slots), large enough that collisions become sufficiently unlikely. This is not a problem for overall size, because the first-level hash distributes keys evenly, so that on average the second-level tables are still small.
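The hash functions for the second-level tables are chosen at random from a set of parameterized hash functions; they are not computed from the keys in the bucket. If the chosen function produces a collision, it is discarded and a new one is drawn.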
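To make the "pick at random, retry on a clash" idea concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the construction from any particular paper or library: keys are assumed to be distinct non-negative integers below the prime P, the family h(x) = ((a*x + b) mod P) mod m is one common choice of parameterized hash functions, and the names random_hash and build_bucket are made up for this example.

```python
import random

# Assumption: all keys are distinct integers in the range [0, P).
P = 2_147_483_647  # the prime 2^31 - 1


def random_hash(m, rng):
    """Draw one function from the family h(x) = ((a*x + b) mod P) mod m."""
    a = rng.randrange(1, P)  # a must be non-zero
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m


def build_bucket(keys, rng=None):
    """Build a collision-free second-level table for the keys of one bucket.

    Returns the chosen hash function, the table, and the number of
    functions that had to be tried.
    """
    rng = rng or random.Random()
    k = len(keys)
    m = max(1, k * k)  # quadratic size makes a clash unlikely
    tries = 0
    while True:
        tries += 1
        h = random_hash(m, rng)
        table = [None] * m
        clash = False
        for key in keys:
            slot = h(key)
            if table[slot] is not None:
                clash = True  # two keys landed in one slot: discard h
                break
            table[slot] = key
        if not clash:
            return h, table, tries


keys = [3, 17, 42, 1000, 99999, 123456]
h, table, tries = build_bucket(keys, random.Random(0))
print(len(table), tries)
```

A lookup is then a single probe: a key is present exactly when table[h(key)] == key. With k^2 slots and a universal family, each random choice clashes with probability below 1/2, so the expected number of tries is less than two, which is why the retry loop finishes quickly on average even though it has no worst-case bound.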
How about watching some MIT lectures? :)
MIT’s Introduction to Algorithms, Lectures 7 and 8: Hashing