
Finding all occurrences of alternating digits using regular expressions

I would like to find all alternating digits in a string using regular expressions. An alternating digit is defined as two equal digits having a digit in between; for example, 1212 contains 2 alternations (121 and 212) and 1111 contains 2 alternations as well (111 and 111). I have the following regular expression code:

import re

s = "1212"
re.findall(r'(\d)(?:\d)(\1)+', s)

This works for strings like "121656", but not for "1212", where the two alternations share characters. I think the problem is that the matches overlap. How can I deal with that?
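To make the failure concrete, here is a small check. It is only a sketch that assumes Python 3's re module, and the variable names are illustrative:

```python
import re

# The pattern from the question, unchanged.
pattern = r'(\d)(?:\d)(\1)+'

# Non-overlapping case: "121" and "656" share no characters.
ok = re.findall(pattern, "121656")

# Overlapping case: the match "121" consumes the first three characters,
# so the scan resumes at the final "2" and "212" is never found.
missed = re.findall(pattern, "1212")

print(ok)      # [('1', '1'), ('6', '6')]
print(missed)  # [('1', '1')]
```

Note that findall returns the captured groups rather than the whole match, which is why each result is a tuple of single digits.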

user1879926 asked Jan 03 '16 05:01


2 Answers

(?=((\d)\d\2))

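Use a lookahead to get all overlapping matches. Because the lookahead consumes no characters, the search can restart at every position. With re.findall, each result is a tuple of the two capture groups; take the first element of each tuple to get the full three-digit match. See the demo: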

https://regex101.com/r/fM9lY3/54
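A minimal sketch of how this pattern might be used from Python. The helper name find_alternations is hypothetical and not part of the re module:

```python
import re

# Outer group 1 captures the full three-digit alternation;
# inner group 2 is the digit that must repeat.
PATTERN = re.compile(r'(?=((\d)\d\2))')

def find_alternations(s):
    """Return every alternation in s, including overlapping ones."""
    # findall yields (group 1, group 2) tuples; keep only group 1.
    return [full for full, _ in PATTERN.findall(s)]

print(find_alternations("1212"))    # ['121', '212']
print(find_alternations("1111"))    # ['111', '111']
print(find_alternations("121656"))  # ['121', '656']
```

Counting the alternations is then just len(find_alternations(s)).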

vks answered Nov 05 '22 02:11


You can use a lookahead to allow for overlapping matches:

r'(\d)(?=(\d)\1)'

To reconstruct full matches from this:

import re

s = "1212"
matches = re.findall(r'(\d)(?=(\d)\1)', s)  # [('1', '2'), ('2', '1')]
[a + b + a for a, b in matches]  # ['121', '212']

Also, to prevent other Unicode digits like ١ from being matched (assuming you don't want them), you should use [0-9] instead of \d.
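A short sketch of the difference, assuming Python 3 with str patterns (where \d matches any Unicode decimal digit). The sample string and variable names are made up for illustration:

```python
import re

# "121" in ASCII digits, then the same shape in Arabic-Indic digits.
s = "121 \u0661\u0662\u0661"

# \d also accepts the Arabic-Indic digits.
unicode_aware = [a + b + a for a, b in re.findall(r'(\d)(?=(\d)\1)', s)]

# [0-9] restricts the match to ASCII digits.
ascii_only = [a + b + a for a, b in re.findall(r'([0-9])(?=([0-9])\1)', s)]

# Alternative: keep \d but pass re.ASCII, which limits \d to [0-9].
ascii_flag = [a + b + a
              for a, b in re.findall(r'(\d)(?=(\d)\1)', s, flags=re.ASCII)]

print(len(unicode_aware))  # 2
print(ascii_only)          # ['121']
print(ascii_flag)          # ['121']
```

Either [0-9] or the re.ASCII flag should give the same result here; which one reads better is a matter of taste.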

Ry- answered Nov 05 '22 02:11