Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding a pattern in a binary string

I'm trying to find a repeating pattern in a string of binary digits.

eg. 0010010010 or 1110111011 = ok

not. 0100101101 = bad

The strings are 10 digits long (as above) & i guess 2 iterations of the 'pattern' are the minimum.

I started manually setting a 'bank' of patterns that the program could match it with but there must be a better way using an algorithm?

Searching got me nowhere - i think the language & terminology i'm using is incorrect..

like image 820
boomshanka Avatar asked May 10 '13 03:05

boomshanka


2 Answers

Quite a challenge. What about this function?

function findPattern(n) {
    var maxlen = parseInt(n.length/2);
    NEXT:
    for(var i=1; i<=maxlen; ++i) {
        var len=0, k=0, prev="", sub;
        do {
            sub = n.substring(k,k+i);
            k+= i;
            len = sub.length;
            if(len!=i) break;
            if(prev.length && sub.length==i && prev!=sub) continue NEXT;
            if(!prev.length) prev = sub;
        } while(sub.length);
        var trail = n.substr(n.length-len);
        if(!len || len && trail==n.substr(0,len)) return n.substr(0,i);
    }
    return false;
}

It even works for any length strings with any contents. See the fiddle

Inspired by Jack's and Zim-Zam's answer, here is the list for brute force algorithm:

var oksubs =
["001","010","011","100","101","110",
"0001","0010","0011","0100","0101","0110","0111",
"1000","1001","1010","1011","1100","1101","1110",
"00000","00001","00011","00101","00110","00111","01000",
"01001","01010","01011","01100","01101","01110","01111",
"10000","10001","10011","10101","10110","10111","11000","11001",
"11010","11011","11100","11101","11110","11111"];

Thanks to Jack's comment, here is both short and effective code:

function findPattern(n) {
    var oksubs = [n.substr(0,5),n.substr(0,4),n.substr(0,3)];
    for(var i=0; i<oksubs.length; ++i) {
        var sub = oksubs[i];
        if((sub+sub+sub+sub).substr(0,10)==n) return sub;
    }
    return false;
}
like image 142
Jan Turoň Avatar answered Oct 20 '22 01:10

Jan Turoň


You've only got 2^10 patterns, that's a small enough number that you can just precompute all of the valid strings and store the results in a 1024-element boolean array; if a string is valid, then convert it to an integer (e.g. "0000001111" = 15) and store "true" in the resulting array index. To check if a string is valid, convert it to an integer and look up the index in the precomputed boolean array.

If your strings are longer than 10 digits then you'll need to be more clever about determining if a string is valid, but since you only have 1024 strings you might as well be lazy about this.

like image 45
Zim-Zam O'Pootertoot Avatar answered Oct 20 '22 00:10

Zim-Zam O'Pootertoot