Thinking about this question on testing string rotation, I wondered: Is there was such thing as a circular/cyclic hash function? E.g.
h(abcdef) = h(bcdefa) = h(cdefab) etc
Uses for this include scalable algorithms which can check n strings against each other to see where some are rotations of others.
I suppose the essence of the hash is to extract information which is order-specific but not position-specific. Maybe something that finds a deterministic 'first position', rotates to it and hashes the result?
It all seems plausible, but slightly beyond my grasp at the moment; it must be out there already...
I'd go along with your deterministic "first position" - find the "least" character; if it appears twice, use the next character as the tie breaker (etc). You can then rotate to a "canonical" position, and hash that in a normal way. If the tie breakers run for the entire course of the string, then you've got a string which is a rotation of itself (if you see what I mean) and it doesn't matter which you pick to be "first".
So:
"abcdef" => hash("abcdef")
"defabc" => hash("abcdef")
"abaac" => hash("aacab") (tie-break between aa, ac and ab)
"cabcab" => hash("abcabc") (it doesn't matter which "a" comes first!)
Update: As Jon pointed out, the first approach doesn't handle strings with repetition very well. Problems arise as duplicate pairs of letters are encountered and the resulting XOR is 0. Here is a modification that I believe fixes the the original algorithm. It uses Euclid-Fermat sequences to generate pairwise coprime integers for each additional occurrence of a character in the string. The result is that the XOR for duplicate pairs is non-zero.
I've also cleaned up the algorithm slightly. Note that the array containing the EF sequences only supports characters in the range 0x00 to 0xFF. This was just a cheap way to demonstrate the algorithm. Also, the algorithm still has runtime O(n) where n is the length of the string.
static int Hash(string s)
{
int H = 0;
if (s.Length > 0)
{
//any arbitrary coprime numbers
int a = s.Length, b = s.Length + 1;
//an array of Euclid-Fermat sequences to generate additional coprimes for each duplicate character occurrence
int[] c = new int[0xFF];
for (int i = 1; i < c.Length; i++)
{
c[i] = i + 1;
}
Func<char, int> NextCoprime = (x) => c[x] = (c[x] - x) * c[x] + x;
Func<char, char, int> NextPair = (x, y) => a * NextCoprime(x) * x.GetHashCode() + b * y.GetHashCode();
//for i=0 we need to wrap around to the last character
H = NextPair(s[s.Length - 1], s[0]);
//for i=1...n we use the previous character
for (int i = 1; i < s.Length; i++)
{
H ^= NextPair(s[i - 1], s[i]);
}
}
return H;
}
static void Main(string[] args)
{
Console.WriteLine("{0:X8}", Hash("abcdef"));
Console.WriteLine("{0:X8}", Hash("bcdefa"));
Console.WriteLine("{0:X8}", Hash("cdefab"));
Console.WriteLine("{0:X8}", Hash("cdfeab"));
Console.WriteLine("{0:X8}", Hash("a0a0"));
Console.WriteLine("{0:X8}", Hash("1010"));
Console.WriteLine("{0:X8}", Hash("0abc0def0ghi"));
Console.WriteLine("{0:X8}", Hash("0def0abc0ghi"));
}
The output is now:
7F7D7F7F
7F7D7F7F
7F7D7F7F
7F417F4F
C796C7F0
E090E0F0
A909BB71
A959BB71
First Version (which isn't complete): Use XOR which is commutative (order doesn't matter) and another little trick involving coprimes to combine ordered hashes of pairs of letters in the string. Here is an example in C#:
static int Hash(char[] s)
{
//any arbitrary coprime numbers
const int a = 7, b = 13;
int H = 0;
if (s.Length > 0)
{
//for i=0 we need to wrap around to the last character
H ^= (a * s[s.Length - 1].GetHashCode()) + (b * s[0].GetHashCode());
//for i=1...n we use the previous character
for (int i = 1; i < s.Length; i++)
{
H ^= (a * s[i - 1].GetHashCode()) + (b * s[i].GetHashCode());
}
}
return H;
}
static void Main(string[] args)
{
Console.WriteLine(Hash("abcdef".ToCharArray()));
Console.WriteLine(Hash("bcdefa".ToCharArray()));
Console.WriteLine(Hash("cdefab".ToCharArray()));
Console.WriteLine(Hash("cdfeab".ToCharArray()));
}
The output is:
4587590
4587590
4587590
7077996
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With