Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to see if regex repeat is reducible

I'm looking for an algorithm that can check if nested regex repeats are reducible. Assume that parsing the regex is already done.

Example:

(1{1,2}){1,2} === 1{1,4}
    It matches 1, 11, 111, 1111 which can be rewritten as a single repeat

(1{2,2}){1,2} can not be reduced
    It matches 11 and 1111 which can not be rewritten as a single repeat.

(1{10,11}){1,2} can not be reduced

(1{10,19}){1,2} === 1{10,38}

(1{1,2}){10,11} === 1{10,22}

(1{10,11})* can not be reduced

(1*){10,11} === 1*

I've been trying to find a pattern to this type of operation without having to match all possible solutions and look for holes that would prevent it from being reduced. There must be a simple function (f( A, B, C, D ) -> ( E, F )) that could solve an arbitrary input like this:

(1{A,B}){C,D} -> 1{E,F}
like image 814
Kendall Hopkins Avatar asked Jun 25 '12 02:06

Kendall Hopkins


1 Answers

// (x{A,B}){C,D} -> x{E,F}
bool SimplifyNestedRepetition(int A, int B,
                              int C, int D,
                              out int E, out int F)
{
    if (B == -1 || C == D || A*(C+1) <= B*C + 1)
    {
        E = A*C;

        if (B == -1 || D == -1) F = -1;
        else F = B*D;

        return true;
    }
    return false;
}
  • If x{A,B} is unlimited, it can be repeated any number of times.
  • (x{A,B}){C} is always reducible.
  • If A*(C+1) <= B*C + 1, you can reduce it, since there are no gap between the longest sequence of C repetitions, and the shortest sequence of C+1 repetitions.

B = -1 or D == -1 means unlimited, like x* or x{5,}.


Test cases:

Input           Reducible?
(x{0,0}){0,0}   Yes - x{0,0}
(x{0,1}){0,0}   Yes - x{0,0}
(x{0,2}){0,0}   Yes - x{0,0}
(x{1,1}){0,0}   Yes - x{0,0}
(x{1,2}){0,0}   Yes - x{0,0}
(x{1,3}){0,0}   Yes - x{0,0}
(x{2,2}){0,0}   Yes - x{0,0}
(x{2,3}){0,0}   Yes - x{0,0}
(x{2,4}){0,0}   Yes - x{0,0}
(x{0,0}){0,1}   Yes - x{0,0}
(x{0,1}){0,1}   Yes - x{0,1}
(x{0,2}){0,1}   Yes - x{0,2}
(x{1,1}){0,1}   Yes - x{0,1}
(x{1,2}){0,1}   Yes - x{0,2}
(x{1,3}){0,1}   Yes - x{0,3}
(x{2,2}){0,1}   No 
(x{2,3}){0,1}   No 
(x{2,4}){0,1}   No 
(x{0,0}){0,2}   Yes - x{0,0}
(x{0,1}){0,2}   Yes - x{0,2}
(x{0,2}){0,2}   Yes - x{0,4}
(x{1,1}){0,2}   Yes - x{0,2}
(x{1,2}){0,2}   Yes - x{0,4}
(x{1,3}){0,2}   Yes - x{0,6}
(x{2,2}){0,2}   No 
(x{2,3}){0,2}   No 
(x{2,4}){0,2}   No 
(x{0,0}){1,1}   Yes - x{0,0}
(x{0,1}){1,1}   Yes - x{0,1}
(x{0,2}){1,1}   Yes - x{0,2}
(x{1,1}){1,1}   Yes - x{1,1}
(x{1,2}){1,1}   Yes - x{1,2}
(x{1,3}){1,1}   Yes - x{1,3}
(x{2,2}){1,1}   Yes - x{2,2}
(x{2,3}){1,1}   Yes - x{2,3}
(x{2,4}){1,1}   Yes - x{2,4}
(x{0,0}){1,2}   Yes - x{0,0}
(x{0,1}){1,2}   Yes - x{0,2}
(x{0,2}){1,2}   Yes - x{0,4}
(x{1,1}){1,2}   Yes - x{1,2}
(x{1,2}){1,2}   Yes - x{1,4}
(x{1,3}){1,2}   Yes - x{1,6}
(x{2,2}){1,2}   No 
(x{2,3}){1,2}   Yes - x{2,6}
(x{2,4}){1,2}   Yes - x{2,8}
(x{0,0}){1,3}   Yes - x{0,0}
(x{0,1}){1,3}   Yes - x{0,3}
(x{0,2}){1,3}   Yes - x{0,6}
(x{1,1}){1,3}   Yes - x{1,3}
(x{1,2}){1,3}   Yes - x{1,6}
(x{1,3}){1,3}   Yes - x{1,9}
(x{2,2}){1,3}   No 
(x{2,3}){1,3}   Yes - x{2,9}
(x{2,4}){1,3}   Yes - x{2,12}
(x{0,0}){2,2}   Yes - x{0,0}
(x{0,1}){2,2}   Yes - x{0,2}
(x{0,2}){2,2}   Yes - x{0,4}
(x{1,1}){2,2}   Yes - x{2,2}
(x{1,2}){2,2}   Yes - x{2,4}
(x{1,3}){2,2}   Yes - x{2,6}
(x{2,2}){2,2}   Yes - x{4,4}
(x{2,3}){2,2}   Yes - x{4,6}
(x{2,4}){2,2}   Yes - x{4,8}
(x{0,0}){2,3}   Yes - x{0,0}
(x{0,1}){2,3}   Yes - x{0,3}
(x{0,2}){2,3}   Yes - x{0,6}
(x{1,1}){2,3}   Yes - x{2,3}
(x{1,2}){2,3}   Yes - x{2,6}
(x{1,3}){2,3}   Yes - x{2,9}
(x{2,2}){2,3}   No 
(x{2,3}){2,3}   Yes - x{4,9}
(x{2,4}){2,3}   Yes - x{4,12}
(x{0,0}){2,4}   Yes - x{0,0}
(x{0,1}){2,4}   Yes - x{0,4}
(x{0,2}){2,4}   Yes - x{0,8}
(x{1,1}){2,4}   Yes - x{2,4}
(x{1,2}){2,4}   Yes - x{2,8}
(x{1,3}){2,4}   Yes - x{2,12}
(x{2,2}){2,4}   No 
(x{2,3}){2,4}   Yes - x{4,12}
(x{2,4}){2,4}   Yes - x{4,16}
like image 69
Markus Jarderot Avatar answered Sep 22 '22 05:09

Markus Jarderot