
Using regex to find arbitrary length consecutive blocks

I have a string containing ones and zeroes. I want to determine whether it contains a substring of 1 or more characters that is repeated at least 3 consecutive times. For example, the string '000' has a length-1 substring, a single zero character, that is repeated 3 times. The string '010010010011' actually has 3 such substrings that are each repeated 3 times ('010', '001', and '100').

Is there a regex that can find these repeating patterns without knowing either the specific pattern or its length? I don't care what the pattern is or how long it is, only that the string contains a 3-peat pattern.

sizzzzlerz asked Dec 28 '11 17:12


2 Answers

Here's something that might work. However, it will only tell you whether some pattern is repeated three times, and I don't think it can be extended to tell you whether there are others:

     /(.+).*?\1.*?\1/

Breaking that out:

   (.+)          matches any 1 or more characters, starting anywhere in the string
   .*?           allows any length of interposing other characters (0 or more)
   \1            matches whatever was captured by the (.+) parentheses
   .*?           0 or more of anything
   \1            the original pattern, again
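
As a quick check of the breakdown above, here is a minimal Python sketch (Python's `re` module also writes backreferences as `\1`). The helper name `has_three_occurrences` and the sample strings are illustrative assumptions, not part of the original answer.

```python
import re

# Loose pattern: the captured text must occur three times,
# but other characters may sit between the occurrences.
LOOSE = re.compile(r'(.+).*?\1.*?\1')

def has_three_occurrences(s):
    """Return True if some substring occurs three times, gaps allowed."""
    return LOOSE.search(s) is not None

print(has_three_occurrences('01010'))  # '0' occurs three times, separated by '1'
print(has_three_occurrences('0110'))   # no substring occurs three times
```

Note that '01010' matches here even though nothing in it repeats back to back, which is why this version is looser than what the question asks for.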

If you want the repetitions to be immediately adjacent, as the question asks, use this instead:

     /(.+)\1\1/

This is the pattern suggested by @Buh Buh. The backreference notation (\1 vs. $1) may vary, depending on your regexp system.
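
For example, a minimal sketch in Python, which uses the `\1` form, could look like the following. The function name `has_three_peat` is made up for illustration; the first two sample strings come from the question and the last two are added as negative cases.

```python
import re

# Strict pattern: a unit of one or more characters followed
# immediately by two more copies of itself.
THREE_PEAT = re.compile(r'(.+)\1\1')

def has_three_peat(s):
    """Return True if s contains a substring repeated 3 consecutive times."""
    return THREE_PEAT.search(s) is not None

for sample in ('000', '010010010011', '01010', '0101'):
    print(sample, has_three_peat(sample))
```

The first two samples should report True and the last two False, since '01010' and '0101' contain no unit that appears three times in a row.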

BRPocock answered Oct 11 '22 11:10


     (.+)\1\1

The \ might be a different character depending on your language choice. The pattern means: match any string of one or more characters, then try to match that same string twice more.

The \1 is a backreference: it matches the same text that the 1st group captured.
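
If you also want to see which units repeat, one possible extension (not taken from either answer) is to wrap the pattern in a lookahead so that overlapping repetitions can be collected. This is only a sketch in Python: the name `repeated_units` is hypothetical, and because of the lazy `.+?` it reports just the shortest repeating unit found at each start position.

```python
import re

# The lookahead consumes no characters, so the scan advances one
# position at a time and overlapping repetitions are all visited.
UNITS = re.compile(r'(?=(.+?)\1\1)')

def repeated_units(s):
    """Return the set of units that appear 3 times in a row somewhere in s."""
    return {m.group(1) for m in UNITS.finditer(s)}

print(sorted(repeated_units('010010010011')))  # ['001', '010', '100']
```

For the question's second example this yields the three units '010', '001', and '100' that the asker lists.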

Buh Buh answered Oct 11 '22 09:10