I came across the regular expression not containing 101 as follows:
0∗1∗0∗+(1+00+000)∗+(0+1+0+)∗
I was unable to understand how the author come up with this regex. So I just thought of string which did not contain 101:
01000100
I seems that above string will not be matched by above regex. But I was unsure. So tried translating to equivalent pcre regex on regex101.com, but failed there too (as it can be seen my regex does not even matches string containing single 1.
Whats wrong with my translation? Is above regex indeed correct? If not what will be the correct regex?
All strings not containing the substring 101: 0*(1*000*)*1*0*. Notice that a 1 may be followed by either a 1 or by a 00, and this pattern can be repeated as many times as we want. This pattern is expressed in (1*000*)*.
L={w∈{0,1}∣w does not contain 00 as a substring}. I've tried various things, but I can't seem to get the correct regular expression.
Basically (0+1)* mathes any sequence of ones and zeroes. So, in your example (0+1)*1(0+1)* should match any sequence that has 1. It would not match 000 , but it would match 010 , 1 , 111 etc. (0+1) means 0 OR 1.
Regex101.com lets us build expressions fast and debug along the way. We can paste in a set of data and then, through trial and error, build a expression that does what we want. The tool makes it clear if data is matching our expression or not.
Here is a bit shorter expression ^0*(1|00+)*0*$
https://www.regex101.com/r/gG3wP5/1
Explanation:
(1|00+)*
we can mix zeroes and ones as long as zeroes occur in groups^0*...0*$
we can have as many zeroes as we want in prefix/suffixDirect translation of the original regexp would be like
^(0*1*0*|(1|00|000)*|(0+1+0+)*)$
Update
This seems like artificially complicated version of the above regexp:
(1|00|000)*
is the same as (1|00+)*
0
, 01..
, and ..10
0*1*0*
doesn't match strings with 101
inside, but matches 0
and some of 01..
, and ..10
01..
, and ..10
which have 0
& 1
mixed inside, e.g. 01001..
or ..10010
(0+1+0+)*
matches some of the remaining cases but there are still some valid strings unmatched
10010
is the shortest string that is not matched by all of the cases.So, this solution is overly complicated and not complete.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With