How would you write a regular expression to define all strings of 0's and 1's that, as a binary number, represent an integer that is multiple of 3.
Some valid binary numbers would be:
11 110 1001 1100 1111
Using the DFA here we can make a regular expression the following way, where A, B, C represent the states of the DFA.
A = 1B + 0A
B = 1A + 0C
C = 1C + 0B
C = 1*0B // Eliminate recursion
B = 1A + 0(1*0B)
B = 01*0B + 1A
B = (01*0)*1A // Eliminate recursion
A = 1(01*0)*1A + 0A
A = (1(01*0)*1 + 0)A
A = (1(01*0)*1 + 0)* // Eliminate recursion
Resulting in a PCRE regex like:
/^(1(01*0)*1|0)+$/
Perl test/example:
use strict;
for(qw(
11
110
1001
1100
1111
0
1
10
111
)){
print "$_ (", eval "0b$_", ") ";
print /^(1(01*0)*1|0)+$/? "matched": "didnt match";
print "\n";
}
Outputs:
11 (3) matched
110 (6) matched
1001 (9) matched
1100 (12) matched
1111 (15) matched
0 (0) matched
1 (1) didnt match
10 (2) didnt match
111 (7) didnt match
When you divide a number by three, there are only three possible remainders (0, 1 and 2). What you're aiming at is to ensure the remainder is 0, hence a multiple of three.
This can be done by an automaton with the three states:
Now think of any non-negative number (that's our domain) and multiply it by two (tack a binary zero on to the end). The transitions for that are:
ST0 -> ST0 (3n * 2 = 3 * 2n, still a multiple of three).
ST1 -> ST2 ((3n+1) * 2 = 3*2n + 2, a multiple of three, plus 2).
ST2 -> ST1 ((3n+2) * 2 = 3*2n + 4 = 3*(2n+1) + 1, a multiple of three, plus 1).
Also think of any non-negative number, multiply it by two then add one (tack a binary one on to the end). The transitions for that are:
ST0 -> ST1 (3n * 2 + 1 = 3*2n + 1, a multiple of three, plus 1).
ST1 -> ST0 ((3n+1) * 2 + 1 = 3*2n + 2 + 1 = 3*(2n+1), a multiple of three).
ST2 -> ST2 ((3n+2) * 2 + 1 = 3*2n + 4 + 1 = 3*(2n+1) + 2, a multiple of three, plus 2).
This idea is that, at the end, you need to finish up in state ST0. However, given that there can be an arbitrary number of sub-expressions (and sub-sub-expressions), it does not lend itself easily to reduction to a regular expression.
What you have to do is allow for any of the transition sequences that can get from ST0 to ST0 then just repeat them:
These boil down to the two RE sequences:
ST0 --> ST0 : 0+
[0]
ST0 --> ST1 (--> ST2 (--> ST2)* --> ST1)* --> ST0: 1(01*0)*1
[1] ([0] ([1] )* [0] )* [1]
or the regex:
(0+|1(01*0)*1)+
This captures the multiples of three, or at least the first ten that I tested. You can try as many as you like, they'll all work, that's the beauty of mathematical analysis rather than anecdotal evidence.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With