I need a hash function that takes a few (eg. 2 or 3) unsigned integers as input, and returns a floating point value between -1 and +1.
The collection of these returned values must be evenly distributed. A sequence of outputs from the function must appear to be a random sequence, even if the input numbers are sequential. Also the faster the better, i'm calling it a LOT of times.
I hope this isn't too much to ask :S...
Murmurhash is a very good (strong) and fast hash function which has had some serious testing done on it.
http://sites.google.com/site/murmurhash/
While it is not dedicated to integers per se, it can be quickly adjusted to do so. I have such an alternate formulation which might be more convenient for you if your words are not sequently laid out in memory:
#define MURMURHASH2A_R 24 #define MURMURHASH2A_MULTIPLIER 0x5bd1e995 #define MURMURHASH2A_SEED 2166136261U // No seed suggested, so using FNV32_OFFSET_BASIS #define murmurhash2a_init(h) do { h = MURMURHASH2A_SEED; } while (0) #define murmurhash2a_update(h,word) \ do { \ u_int mmh2ak = (word) * MURMURHASH2A_MULTIPLIER; \ mmh2ak ^= mmh2ak >> MURMURHASH2A_R; \ mmh2ak *= MURMURHASH2A_MULTIPLIER; \ h *= MURMURHASH2A_MULTIPLIER; \ h ^= mmh2ak; \ } while (0) #define murmurhash2a_final(h) \ do { \ h ^= h >> 13; \ h *= MURMURHASH2A_MULTIPLIER; \ h ^= h >> 15; \ } while (0) u_int hash; murmurhash2a_init(hash); murmurhash2a_update(hash,firstint); murmurhash2a_update(hash,secondint); [...] murmurhash2a_final(hash);
Obviously this is returning 0-2^32-1. There is a 64 bit version on the murmurhash site. Conversion of integer to a float over a range is left as an excercise (in division) for the reader.
You can employ standard scheme for such tasks: (a0 + Q*a1 + Q^2*a2 + Q^3*a3 + ...) % M
where M
is a very large prime number and Q
is coefficient of your choice.
Once you have random enough hash in range [0, M)
, converting it to floating point number [-1, 1]
is trivial.
Or you can remove % M
and allow integer overflow to happen, although I'm not sure how secure it is (from 'evenly distributed' perspective).
A sequence of outputs from the function must appear to be a random sequence, even if the input numbers are sequential.
For this you can instead of ai
use ai*ai
in the expression. Anyway, here's the simple implementation in Java.
double hash(int... a) {
int Q = 433494437;
int result = 0;
for (int n : a) {
result = result * Q + n * n;
}
result *= Q;
return (double) result / Integer.MIN_VALUE;
}
Output does look random even for consecutive numbers. You can also use 64-bit integer for more precision.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With