I have two arrays: char data1[length], where length is a multiple of 8, i.e. length can be 8, 16, 24, ... The array contains binary data read from a file that is open in binary mode. I will keep reading from the file, and every time I read I will store the value in a hash table. The binary data is randomly distributed. I would like to hash each array and store it in a hash table so that I can look up the array holding that specific data again. What would be a good hash function to achieve this? Thanks
Please note that I am writing this in C++ and C, so a solution in either language would be great.
If the data that you read is 8 bytes long and really distributed randomly, and your hashcode needs to be 32 bits, what about this:
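#include <stdint.h>

/* Read 4 bytes as a little-endian 32-bit value, independent of host byte order.
   Each byte is cast to uint32_t first so the shift by 24 cannot overflow int. */
uint32_t get_uint32_le(const unsigned char *data) {
    uint32_t value = 0;
    value |= (uint32_t)data[0] << 0;
    value |= (uint32_t)data[1] << 8;
    value |= (uint32_t)data[2] << 16;
    value |= (uint32_t)data[3] << 24;
    return value;
}

/* XOR the two 32-bit halves of an 8-byte block into one 32-bit hash code. */
uint32_t hashcode(const unsigned char *data) {
    uint32_t hash = 0;
    hash ^= get_uint32_le(data + 0);
    hash ^= get_uint32_le(data + 4);
    return hash;
}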
If you need more speed, this code can probably be made a lot faster if you can guarantee that data is always properly aligned to be interpreted as a const uint32_t *.
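One way to get most of that speed without relying on alignment is to load each word with memcpy, which compilers typically turn into a single load instruction. The sketch below also extends the XOR fold to any length that is a multiple of 8, as in the question. The names load_u32 and hashcode_n are made up for this illustration, and the loads use host byte order, so the result is only meant to be used inside one process, not stored or shared between machines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load 4 bytes in host byte order.
   memcpy is valid for any alignment, unlike casting to const uint32_t *. */
static uint32_t load_u32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical generalization: XOR-fold every 32-bit word of the buffer.
   Assumes length is a multiple of 8 (so also a multiple of 4). */
static uint32_t hashcode_n(const unsigned char *data, size_t length) {
    uint32_t hash = 0;
    for (size_t i = 0; i + 4 <= length; i += 4) {
        hash ^= load_u32(data + i);
    }
    return hash;
}
```

Note that a plain XOR fold ignores word order, so it is only reasonable under the stated assumption that the data really is randomly distributed.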
I have successfully used MurmurHash3 in one of my projects.
Pros:
Cons:
It's a good possibility for use in e.g. a fast hash-table implementation...
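To give a feel for the kind of bit mixing involved, here is a sketch of the 32-bit finalization step from the MurmurHash3 reference code (called fmix32 there). This is only the final avalanche step, not the whole hash function; for real use, take the complete reference implementation rather than this fragment.

```c
#include <stdint.h>

/* 32-bit finalizer as used in MurmurHash3: alternating xor-shifts and
   multiplications so that each input bit affects many output bits. */
static uint32_t fmix32(uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}
```

Every step is invertible, so distinct inputs always give distinct outputs. That makes it usable as a cheap post-mixing step on a 32-bit value before reducing it to a table index.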