Given a string abcd
how can I create a unique hashing method that will hash those 4 characters to match bcad
or any other permutation of the letters abcd
?
Currently I have this code
long hashString(string a) {
long hashed = 0;
for(int i = 0; i < a.length(); i++) {
hashed += a[i] * 7; // Timed by a prime to make the hash more unique?
}
return hashed;
}
Now this will not work becasue ad
will hash with bc
.
I know you can make it more unique by multiplying the position of the letter by the letter itself hashed += a[i] * i
but then the string will not hash to its permutations.
Is it possible to create a hash that achieves this?
Edit
Some have suggested sorting the strings before you hash them. Which is a valid answer but the sorting would take O(nlog) time and I am looking for a hash function that runs in O(n) time.
I am looking to do this in O(1) memory.
Create an array of 26 integers, corresponding to letters a-z
. Initialize it to 0. Scan the string from beginning to end, and increment the array element corresponding to the current letter. Note that up to this point the algorithm has O(n)
time complexity and O(1)
space complexity (since the array size is a constant).
Finally, hash the contents of the array using your favorite hash function.
The basic thing you can do is sort the strings before applying the hash function. So, to compute the hash of "adbc" or "dcba" you instead compute the hash of "abcd".
If you want to make sure that there are no collisions in your hash function, then the only way is to have the hash result be a string. There are many more strings than there are 32-bit (or 64-bit) integers so collisions are innevitable (collisions are unlikely with a good hash function though).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With