Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Obtaining a k-wise independent hash function

I need to use a hash function which belongs to a family of k-wise independent hash functions. Any pointers on any library or toolkit in C, C++ or python which can generate a set of k-wise independent hash functions from which I can pick a function.

Background: I am trying to implement this algorithm here: http://researcher.watson.ibm.com/researcher/files/us-dpwoodru/knw10b.pdf for the Distinct Elements problem.

I have looked at this thread: Generating k pairwise independent hash functions which mentions using Murmur hash to generate a pairwise independent hash function. I was wondering if there is anything similar for k-wise independent hash functions. If there is none available, would it be possible for me to construct such a set of k-wise independent hash functions.

Thanks in advance.

like image 513
vkmv Avatar asked Apr 29 '13 17:04

vkmv


1 Answers

The simplest k-wise independent hash function (mapping positive integer x < p to one of m buckets) is just

h

where p is some big random prime (261-1 will work) and ai are some random positive integers less than p, a0 > 0.

2-wise independent hash: h(x) = (ax + b) % p % m

again, p is prime, a > 0, a,b < p (i.e. a can't be zero but b can when that is a random choice)

These formulas define families of hash functions. They work (in theory) if you select a hash function randomly from corresponding family (i.e. if you generate random a's and b) each time you run your algorithm.

like image 82
podshumok Avatar answered Sep 28 '22 05:09

podshumok