
Checking if two Python regex patterns are equivalent

Tags: python, regex

I want to rewrite a regex in re.VERBOSE mode, but I'm not confident that I can add the whitespace and comments without accidentally changing what the pattern matches.

I remember that, in theory, the equivalence of two regexes (without backreferences, at least) can be decided by converting each one to a minimal deterministic automaton and checking whether the two automata are isomorphic. But I can't see any method on compiled pattern objects for comparing regexes.

Is there a way to either generate the automaton of a regex or directly compare them, preferably with the standard library?

(I've already decided on a different solution to my problem, but this is still of interest to me.)

leewz asked Jan 28 '14 06:01


1 Answer

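You can use the re.DEBUG flag, which prints the parsed form of a pattern while it is being compiled. The output format is not documented and differs between Python versions: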

>>> import re
>>> r1 = re.compile("foo[bar]baz", re.DEBUG)
literal 102
literal 111
literal 111
in
  literal 98
  literal 97
  literal 114
literal 98
literal 97
literal 122
>>> r2 = re.compile("""foo   # foo!
...                    [bar] # b or a or r!
...                    baz   # baz!""", re.VERBOSE|re.DEBUG)
literal 102
literal 111
literal 111
in
  literal 98
  literal 97
  literal 114
literal 98
literal 97
literal 122

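If the output is identical, r1 and r2 were parsed into the same structure and therefore match exactly the same strings. The converse does not hold: two patterns can be equivalent and still produce different output (a+ and aa*, for example), so this check confirms that a rewrite is purely cosmetic, but it is not a general equivalence test.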
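
To avoid comparing the two dumps by eye, the printed output can be captured and compared as strings. This is only a minimal sketch: debug_dump is a helper name made up for this example, not part of the re module, and it assumes that re.DEBUG writes to sys.stdout and that patterns compiled with re.DEBUG are not served from the re cache (both hold in CPython as far as I know).

```python
import contextlib
import io
import re


def debug_dump(pattern, flags=0):
    """Return the text that re.DEBUG prints while compiling `pattern`.

    Hypothetical helper for illustration; it relies on re.DEBUG
    printing to sys.stdout, which redirect_stdout temporarily replaces.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        re.compile(pattern, flags | re.DEBUG)
    return buffer.getvalue()


plain = "foo[bar]baz"
verbose = """foo   # foo!
             [bar] # b or a or r!
             baz   # baz!"""

# Equal dumps mean both patterns were parsed into the same structure.
same_structure = debug_dump(plain) == debug_dump(verbose, re.VERBOSE)
print(same_structure)
```

For the two patterns from the answer the comparison should come out true. A false result only says that the parsed structures differ, not that the patterns match different strings.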

Tim Pietzcker answered Oct 11 '22 17:10