I need to understand if a string is sufficiently random or not. Can anyone point me in the right direction?
I need to emulate a process behaviour in which a process copies itself to a temp location, renames itself to a random name, and executes itself. My ultimate goal is to detect such activity. As part of this work I need to test a process name, which is a string, for randomness. I understand that Kolmogorov complexity deals with this, but it is uncomputable. What would be quick alternatives: some form of entropy measure, or a Lempel-Ziv compression ratio?
string s1 = "test process name";
string s2 = "hgoi4dFh3e905jv";
double sensitivity = 0.5; // user-defined variable, a subjective threshold of randomness
bool b1 = SeemsRandom(s1, sensitivity); // false
bool b2 = SeemsRandom(s2, sensitivity); // true
bool SeemsRandom(string input, double sensitivity)
{
...
}
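For reference, one of the quick alternatives mentioned above, character-level Shannon entropy, could be sketched roughly as follows. This is only an illustration in Python, not a finished SeemsRandom: the function names `shannon_entropy` and `normalised_entropy` are made up for this example, and dividing by log2 of the string length is just one assumed way to get a score between 0 and 1.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    counts = Counter(text)
    # Sum of p * log2(1/p) over the distinct characters.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def normalised_entropy(text: str) -> float:
    """Entropy divided by its maximum for this length (assumed normalisation).

    The maximum, log2(len(text)), is reached when every character is distinct.
    """
    if len(text) < 2:
        return 0.0
    return shannon_entropy(text) / math.log2(len(text))


if __name__ == "__main__":
    for name in ("test process name", "hgoi4dFh3e905jv"):
        print(name, round(normalised_entropy(name), 3))
```

By my calculation the two example strings score roughly 0.81 and 0.97, so the random-looking name does come out higher, but a threshold of 0.5 would not separate them. On strings this short the measure is coarse, and any threshold would need to be calibrated against real process names.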
You may want to try converting the string to a binary sequence and then applying the Wald-Wolfowitz runs test, which should be less complicated than the Kolmogorov–Smirnov test.
http://en.wikipedia.org/wiki/Wald%E2%80%93Wolfowitz_runs_test
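A minimal sketch of that idea in Python, under some assumptions of mine: the string is turned into bits via its UTF-8 bytes (most significant bit first), the usual normal approximation is used for the number of runs, and the helper names `string_to_bits` and `runs_test` are hypothetical. How well bit-level runs separate real process names from random ones would have to be checked on actual data.

```python
import math


def string_to_bits(text: str) -> list[int]:
    """Bits of the UTF-8 encoding, most significant bit of each byte first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]


def runs_test(bits: list[int]) -> tuple[float, float]:
    """Wald-Wolfowitz runs test on a 0/1 sequence.

    Returns (z, p) where z is the standardised number of runs and p is the
    two-sided p-value under the normal approximation.
    """
    n1 = sum(bits)
    n0 = len(bits) - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("need at least one 0 and one 1")
    n = n0 + n1
    # A run is a maximal block of identical symbols.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    mean = 2.0 * n1 * n0 / n + 1.0
    variance = (mean - 1.0) * (mean - 2.0) / (n - 1.0)
    if variance <= 0.0:
        raise ValueError("sequence too short for the normal approximation")
    z = (runs - mean) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    for name in ("test process name", "hgoi4dFh3e905jv"):
        z, p = runs_test(string_to_bits(name))
        print(f"{name!r}: z={z:.3f}, p={p:.3f}")
```

A strongly negative z means fewer runs than expected (long blocks of identical bits), a strongly positive z means too many (near-alternating bits). A SeemsRandom built on this could compare p against the sensitivity value, but that mapping is a design choice, not part of the test itself.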