Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Regular expression not containing 101

Tags:

regex

I came across the regular expression not containing 101 as follows:

010+(1+00+000)+(0+1+0+)

I was unable to understand how the author come up with this regex. So I just thought of string which did not contain 101:

01000100

I seems that above string will not be matched by above regex. But I was unsure. So tried translating to equivalent pcre regex on regex101.com, but failed there too (as it can be seen my regex does not even matches string containing single 1.

Whats wrong with my translation? Is above regex indeed correct? If not what will be the correct regex?

like image 601
Mahesha999 Avatar asked Jan 17 '16 09:01

Mahesha999


People also ask

What is the regular expression of a string not containing 101 as a substring?

All strings not containing the substring 101: 0*(1*000*)*1*0*. Notice that a 1 may be followed by either a 1 or by a 00, and this pattern can be repeated as many times as we want. This pattern is expressed in (1*000*)*.

Which of the following regular expression does not contain 00 as substring?

L={w∈{0,1}∣w does not contain 00 as a substring}. I've tried various things, but I can't seem to get the correct regular expression.

What does regex 0 * 1 * 0 * 1 * Mean?

Basically (0+1)* mathes any sequence of ones and zeroes. So, in your example (0+1)*1(0+1)* should match any sequence that has 1. It would not match 000 , but it would match 010 , 1 , 111 etc. (0+1) means 0 OR 1.

How does regex 101 work?

Regex101.com lets us build expressions fast and debug along the way. We can paste in a set of data and then, through trial and error, build a expression that does what we want. The tool makes it clear if data is matching our expression or not.


1 Answers

Here is a bit shorter expression ^0*(1|00+)*0*$

https://www.regex101.com/r/gG3wP5/1

Explanation:

  • (1|00+)* we can mix zeroes and ones as long as zeroes occur in groups
  • ^0*...0*$ we can have as many zeroes as we want in prefix/suffix

Direct translation of the original regexp would be like

^(0*1*0*|(1|00|000)*|(0+1+0+)*)$

Update
This seems like artificially complicated version of the above regexp:

  • (1|00|000)* is the same as (1|00+)*
    • it is almost the solution, but it does not match strings 0, 01.., and ..10
  • 0*1*0* doesn't match strings with 101 inside, but matches 0 and some of 01.., and ..10
    • we still need to match those of 01.., and ..10 which have 0 & 1 mixed inside, e.g. 01001.. or ..10010
  • (0+1+0+)* matches some of the remaining cases but there are still some valid strings unmatched
    • e.g. 10010 is the shortest string that is not matched by all of the cases.

So, this solution is overly complicated and not complete.

like image 125
max taldykin Avatar answered Sep 27 '22 17:09

max taldykin