I'm looking for a function that maps a multi-set of integers to an integer, hopefully with some kind of guarantee like pairwise independence.
Ideally, memory usage would be constant, and the hash value could be updated in O(1) time after an insert/delete. (This forbids doing something like sorting the integers and using a hash function like h(x) = h_1(x_1, h_2(x_2, h_3(x_3, x_4))).)
XORing hashes together doesn't work because duplicate elements cancel out: h(1) XOR h(1) = 0, so h({1,1,2}) = h({2}).
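To make the XOR problem concrete, here is a minimal sketch. The per-element hash `h` (a truncated SHA-256 of the decimal string) and the helper `xor_hash` are assumptions made up for illustration; any per-element hash would show the same cancellation.

```python
import hashlib
from functools import reduce


def h(x: int) -> int:
    """Hypothetical per-element hash: first 8 bytes of SHA-256 of str(x)."""
    digest = hashlib.sha256(str(x).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def xor_hash(items) -> int:
    """XOR the per-element hashes together; the order of items does not matter."""
    return reduce(lambda acc, x: acc ^ h(x), items, 0)


# The two copies of 1 cancel each other, so both multisets hash identically.
print(xor_hash([1, 1, 2]) == xor_hash([2]))  # True
```

The collision holds for any element that occurs an even number of times, regardless of how good `h` is.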
I think multiplying hashes together modulo a prime might work if the underlying hash function had an unrealistically strong guarantee, such as n-independence.
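A rough sketch of the multiply-modulo-a-prime idea, only to show the update mechanics. The prime `P`, the per-element hash `h`, and the class `MultisetHash` are all assumptions for illustration, and this sketch makes no claim about pairwise independence or any other guarantee. Deletion multiplies by the modular inverse, which needs Python 3.8+ for `pow(x, -1, P)`.

```python
import hashlib

# 2**61 - 1 is a Mersenne prime; chosen here only for illustration.
P = (1 << 61) - 1


def h(x: int) -> int:
    """Hypothetical per-element hash mapped into 1..P-1.

    The value is never 0, so it always has an inverse modulo P.
    """
    digest = hashlib.sha256(str(x).encode()).digest()
    return int.from_bytes(digest[:8], "big") % (P - 1) + 1


class MultisetHash:
    """Keeps a single integer of state: the product of h(x) over all elements."""

    def __init__(self) -> None:
        self.value = 1  # hash of the empty multiset

    def insert(self, x: int) -> None:
        self.value = self.value * h(x) % P

    def delete(self, x: int) -> None:
        # Assumes x is currently in the multiset; this is not checked.
        self.value = self.value * pow(h(x), -1, P) % P


a = MultisetHash()
for x in (1, 1, 2):
    a.insert(x)

b = MultisetHash()
b.insert(2)

print(a.value != b.value)  # duplicates no longer cancel as they do with XOR
a.delete(1)
a.delete(1)
print(a.value == b.value)  # removing both copies of 1 leaves the hash of {2}
```

Memory is one machine-word-sized integer, and each update is one multiplication plus, for deletes, one modular inverse on a fixed-size modulus.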
A good hash function to use with integer key values is the mid-square method. The mid-square method squares the key value, and then takes out the middle r bits of the result, giving a value in the range 0 to 2^r - 1. This works well because most or all bits of the key value contribute to the result.
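The mid-square method can be sketched as follows. The function name `mid_square` and the `key_bits` parameter (the assumed width of the key, 32 bits by default) are not from the text above; they are choices made for this example.

```python
def mid_square(key: int, r: int, key_bits: int = 32) -> int:
    """Square the key and return the middle r bits of the square.

    Assumes the key fits in key_bits bits (it is masked to that width),
    so the square fits in 2 * key_bits bits, and that r <= 2 * key_bits.
    """
    if not 0 < r <= 2 * key_bits:
        raise ValueError("r must be between 1 and 2 * key_bits")
    squared = (key & ((1 << key_bits) - 1)) ** 2
    # Drop the low bits so that the r bits we keep sit in the middle.
    shift = (2 * key_bits - r) // 2
    return (squared >> shift) & ((1 << r) - 1)


# 65536 squared is 2**32; with r = 8 the shift is 28, leaving 2**4.
print(mid_square(65536, 8))  # 16
```

The result always lies in the range 0 to 2^r - 1, so it can index a table with 2^r slots directly.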
Characteristics of a Good Hash Function. The main characteristics of a good hash function include: 1) The hash value is fully determined by the data being hashed. 2) The hash function uses all the input data. 3) The hash function "uniformly" distributes the data across the entire set of possible hash values.
A good hash function should have the following properties: it should be efficiently computable, and it should distribute the keys uniformly (each table position is equally likely for each key).
The most common hash functions used in digital forensics are Message Digest 5 (MD5) and Secure Hash Algorithm (SHA) 1 and 2.
I asked this same question on cstheory.stackexchange.com and got a good answer:
https://cstheory.stackexchange.com/questions/3390/is-there-a-hash-function-for-a-collection-i-e-multi-set-of-integers-that-has