Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Regular Expression to detect repetition within a string

Tags:

c#

regex

Is it possible to detect repeated number patterns with a regular expression?

So for example, if I had the following string "034503450345", would it be possible to match the repeated sequence 0345? I have a feeling this is beyond the scope of regex, but I thought I would ask here anyway to see if I have missed something.

like image 393
Mark Withers Avatar asked Jun 03 '09 09:06

Mark Withers


2 Answers

This expression will match one or more repeating groups:

(.+)(?=\1+)


Here is the same expression broken down, (using commenting so it can still be used directly as a regex).

(?x)  # enable regex comment mode
(     # start capturing group
.+    # one or more of any character (excludes newlines by default)
)     # end capturing group
(?=   # begin lookahead
\1+   # match one or more of the first capturing group
)     # end lookahead


To match a specific pattern, change the .+ to that pattern, e.g. \d+ for one or more numbers, or \d{4,} to match 4 or more numbers.

To match a specific number of the pattern, change \1+, e.g to \1{4} for four repetitions.

To allow the repetition to not be next to each other, you can add .*? inside the lookahead.

like image 194
Peter Boughton Avatar answered Oct 13 '22 13:10

Peter Boughton


Yes, you can - here's a Python test case

import re
print re.search(r"(\d+).*\1", "8034503450345").group(1)
# Prints 0345

The regular expression says "find some sequence of digits, then any amount of other stuff, then the same sequence again."

On a barely-related note, here's one of my favourite regular expressions - a prime number detector:

import re
for i in range(2, 100):
    if not re.search(r"^(xx+)\1+$", "x"*i):
        print i
like image 23
RichieHindle Avatar answered Oct 13 '22 13:10

RichieHindle