Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Library to check if two regular expressions are equal/isomorphic [closed]

Tags:

I need a library which will take in two regular expressions and determine whether they are isomorphic (i.e. match exactly the same set of strings or not) For example a|b is isomorphic to [ab]

As I understand it, a regular expression can be converted to an NFA which in some cases can be efficiently converted to a DFA. The DFA can then be converted to a minimal DFA, which, if I understand it correctly, is unique and so these minimal DFA's can then be compared for equality. I realize that not all regular expression NFA's can be efficently transformed into DFA's (especially when they were generate from Perl Regexps which are not truly "regular") in which case ideally the library would just return an error or some other indication that the conversion is not possible.

I see tons of articles and academic papers on-line about doing this (and even some programming assignments for classes asking students to do this) but I can't seem to find a library which implements this functionality. I would prefer a Python and/or C/C++ library, but a library in any language will do. Does anyone know if such a library? If not, does someone know of a library that gets close that I can use as a starting point?

like image 868
user1255384 Avatar asked Mar 07 '12 18:03

user1255384


People also ask

How do you know if two regular expressions are equal?

We say that two regular expressions R and S are equivalent if they describe the same language. In other words, if L(R) = L(S) for two regular expressions R and S then R = S. Examples. Are R and S equivalent?

What does '*' represent in regular expression?

*$ means - match, from beginning to end, any character that appears zero or more times. Basically, that means - match everything from start to end of the string. This regex pattern is not very useful. Let's take a regex pattern that may be a bit useful.

What is regular expression parsing?

Regular expression parsing makes finding matches of a regular expression even more useful by allowing us to directly extract subpatterns of the match, e.g., for extracting IP-addresses from internet traffic analysis or extracting subparts of genomes from genetic data bases.

How do you identify a regular expression?

A regular expression (sometimes called a rational expression) is a sequence of characters that define a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. “find and replace”-like operations.


2 Answers

Haven't tried it, but Regexp:Compare for Perl looks promising: two regex's are equivalent if the language of the first is a subset of the second, and vice verse.

like image 56
user1071136 Avatar answered Oct 20 '22 19:10

user1071136


The brics automaton library for Java supports this. It can be used to convert regular expressions to minimal Deterministic Finite State Automata, and check if these are equivalent:

public static void isIsomorphic(String regexA, String regexB) {     Automaton a = new RegExp(regexA).toAutomaton();     Automaton b = new RegExp(regexB).toAutomaton();     return a.equals(b); } 

Note that this library only works for regular expressions that describe a regular language: it does not support some more advanced features, such as backreferences.

like image 23
ChrisBlom Avatar answered Oct 20 '22 18:10

ChrisBlom