What are the best algorithms available to find the longest repeating patterns of characters in a string using .NET?
I guess you are talking about pattern discovery. Take a look at this elementary approach (source):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

class Program {
    private static Dictionary<string, int> FindPatterns(string value) {
        // Stage 1: collect every distinct substring of length 2 up to value.Length / 2.
        HashSet<string> patternsToSearch = new HashSet<string>();
        for (int i = 0; i < value.Length; i++) {
            for (int j = 2; j <= value.Length / 2 && i + j <= value.Length; j++) {
                patternsToSearch.Add(value.Substring(i, j));
            }
        }
        // Stage 2: pattern matching. Keep only candidates that occur more than once.
        Dictionary<string, int> results = new Dictionary<string, int>();
        foreach (string pattern in patternsToSearch) {
            // Regex.Escape keeps characters such as '.' or '(' from being read as regex syntax.
            int occurrence = Regex.Matches(value, Regex.Escape(pattern), RegexOptions.IgnoreCase).Count;
            if (occurrence > 1) {
                results[pattern] = occurrence;
            }
        }
        return results;
    }

    static void Main(string[] args) {
        Dictionary<string, int> result = FindPatterns("asdxgkeopgkajdflkjbpoijadadafhjkafikeoadkjhadfkjhocihakeo");
        foreach (KeyValuePair<string, int> res in result.OrderByDescending(r => r.Value)) {
            Console.WriteLine("Pattern:" + res.Key + " occurrence:" + res.Value);
        }
        Console.Read();
    }
}
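The algorithm consists of 2 stages: first it collects every candidate substring of length 2 up to half the length of the input, then it counts how often each candidate occurs in the input and keeps those that occur more than once.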
Regex is used for the pattern matching. There are other, more advanced algorithms; they are listed at http://www-igm.univ-mlv.fr/~lecroq/string/ although the code samples there are written in C. You could also take a look at the Boyer-Moore algorithm for pattern matching, written in C#.
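If you would rather not go through Regex for the counting stage, a plain IndexOf loop is one possible alternative. The sketch below is only an illustration: the names OccurrenceCounter and CountOccurrences are made up for this example. Note that, unlike Regex.Matches, it also counts overlapping occurrences, and it treats the pattern as literal text, so characters such as '.' have no special meaning.

```csharp
using System;

public static class OccurrenceCounter
{
    // Counts occurrences of pattern in text, ignoring case.
    // Overlapping occurrences are counted too ("aa" occurs 3 times in "aaaa").
    public static int CountOccurrences(string text, string pattern)
    {
        if (string.IsNullOrEmpty(pattern))
        {
            return 0;
        }

        int count = 0;
        int index = text.IndexOf(pattern, 0, StringComparison.OrdinalIgnoreCase);
        while (index >= 0)
        {
            count++;
            // Resume one character after the last hit so overlaps are found.
            index = text.IndexOf(pattern, index + 1, StringComparison.OrdinalIgnoreCase);
        }
        return count;
    }
}
```

For example, OccurrenceCounter.CountOccurrences("aaaa", "aa") returns 3, where Regex.Matches would report 2 non-overlapping matches.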
Pseudocode:
For N = 1 to InputString.Length - 1
    rotatedString = RotateStringByN(InputString, N)
    For i = 0 to InputString.Length - 1
        StringResult[i] = if (rotatedString[i] == InputString[i]) then
                              InputString[i]
                          else
                              Convert.ToChar(0x0).ToString()
    RepeatedStrings[] = String.Split(StringResult, Convert.ToChar(0x0).ToString())
    SaveLongestStringFrom(RepeatedStrings)
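The idea in the pseudocode above could look roughly like this in C#. This is a minimal sketch, not a definitive implementation: the names RepeatFinder and LongestRepeatedSubstring are invented for the example, and it makes one deliberate change. Instead of rotating the string, it compares each character with the one N positions further on and stops at the end of the string, because a true rotation wraps around and can report a "repeat" that only exists across the wrap (for "abab" rotated by 2, every position matches). It tracks the longest run of matches directly rather than building and splitting an intermediate string.

```csharp
using System;

public static class RepeatFinder
{
    // Returns the longest substring that occurs at least twice in input
    // (occurrences may overlap), or "" if nothing repeats.
    // Runs in O(n^2) time and O(1) extra space.
    public static string LongestRepeatedSubstring(string input)
    {
        int bestStart = 0;
        int bestLength = 0;

        for (int shift = 1; shift < input.Length; shift++)
        {
            int run = 0; // length of the current streak of matching characters
            for (int i = 0; i + shift < input.Length; i++)
            {
                if (input[i] == input[i + shift])
                {
                    run++;
                    if (run > bestLength)
                    {
                        bestLength = run;
                        bestStart = i - run + 1;
                    }
                }
                else
                {
                    run = 0;
                }
            }
        }

        return input.Substring(bestStart, bestLength);
    }
}
```

For example, RepeatFinder.LongestRepeatedSubstring("banana") returns "ana", found at shift 2, where the two occurrences overlap.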
... Or just look at this SO thread for other solutions.