Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Perfect hash function?

While reading the pigeonhole principle on Wikipedia, I come across - "collisions are inevitable in a hash table because the number of possible keys exceeds the number of indices in the array. No hashing algorithm, no matter how clever, can avoid these collisions". But isn't gperf doing this exactly?

Please enlighten.

like image 328
user138645 Avatar asked Nov 17 '10 11:11

user138645


People also ask

What do you mean by perfect hash function?

Perfect hashing is defined as a model of hashing in which any set of n elements can be stored in a hash table of equal size and can have lookups performed in constant time.

What is a perfect hash in a hash table?

Definition: A hash function that maps each different key to a distinct integer. Usually all possible keys must be known beforehand. A hash table that uses a perfect hash has no collisions. Formal Definition: A function f is perfect for a set of keys K iff ∀ j, k ∈ K f(j) = f(k) → j = k.

What is perfect hash function Java?

A perfect hash function maps every element to a distinct integer with no collisions — in mathematical terms it's an injective function. A minimal perfect hash (mph) function maps n keys to the n consecutive integers [0, n-1].


2 Answers

gperf creates the hash function and the hash table basing on a predefined list of strings.

It therefore appears to me that gperf creates the hashes long enough so that there are enough possibilities.
That's what you can do only if you know every possible key upfront - which is an assumption which didn't hold for the description in the wikipedia's entry, which was apparently related to a "non-constant" hash-table.

like image 54
Kos Avatar answered Sep 18 '22 04:09

Kos


From the gperf's website: "For a given list of strings, it produces a hash function and hash table,..." - which means it has to know all the strings previously in order to prepare a function, that works without collisions.

Usual hash functions you are using in general programming languages are able to handle any strings as input one after another (the list is not given at once) and therefore they can produce collisions.

like image 20
eumiro Avatar answered Sep 21 '22 04:09

eumiro