
Regex to find a pair of adjacent digits with different digits around them

I'm a beginner with regex, and I am trying to write an expression that checks whether a string contains two identical digits next to each other, where the digits immediately before and after the pair are different from it.

For example,

123456678 should match, as there is a double 6.

1234566678 should not match, as there is no double with different surrounding digits.

12334566 should match, because there are two 3s.

So far I have this, which works only with 1, and only as long as the double is not at the start or end of the string; however, I can deal with that by adding a letter at the start and end.

^.*([^1]11[^1]).*$ 

I know I can use [0-9] instead of the 1s, but the problem is making them all be the same digit.

Thank you!

Archie Adams asked Jun 19 '20 17:06




1 Answer

In Python, it is much more convenient to use the PyPI regex module with a (*SKIP)(*FAIL) based pattern:

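import regex  # third-party module from PyPI (pip install regex), not the built-in re

# Runs of three or more identical digits are matched and skipped; only a true pair can match.
rx = r'(\d)\1{2,}(*SKIP)(*F)|(\d)\2'
l = ["123456678", "1234566678"]
for s in l:
    print(s, bool(regex.search(rx, s)))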

Output:

123456678 True
1234566678 False

Regex details

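  • (\d)\1{2,}(*SKIP)(*F) - a digit and then two or more occurrences of the same digit (a run of three or more); (*F) then forces this alternative to fail, and (*SKIP) makes the engine resume searching after the run instead of retrying inside it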
  • | - or
  • (\d)\2 - a digit and then the same digit.

The point is to match every run of three or more identical digits and skip it, so that the second alternative can only match a pair of identical digits that is not part of a longer run.
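
For comparison, the same idea can be sketched with only the standard library re module, which does not support (*SKIP)(*F). This is a minimal sketch rather than part of the answer above: the function name has_isolated_pair is made up for illustration, and it assumes that any non-digit character simply ends a run. It uses (\d)\1* to consume each maximal run of a repeated digit and then checks whether any run is exactly two characters long.

```python
import re

# One maximal run of the same digit: a digit, then as many repeats of it as possible.
RUN = re.compile(r'(\d)\1*')

def has_isolated_pair(s: str) -> bool:
    """Return True if s contains a digit repeated exactly twice in a row."""
    # finditer yields non-overlapping matches from left to right, so each match
    # is a whole run; a run of three or more is never split into pairs.
    return any(len(m.group(0)) == 2 for m in RUN.finditer(s))

for s in ["123456678", "1234566678", "12334566"]:
    print(s, has_isolated_pair(s))
```

Because each run is consumed whole, a pair at the very start or end of the string is handled without padding the input with extra characters.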


Wiktor Stribiżew answered Sep 20 '22 01:09