Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Trying to find an algorithm which takes 2 regular expressions and tells whether they are equivalent

I'm trying to find out what the algorithm would be by being given two languages L1 and L2 to determine if they are equivalent (L1 = L2).

It's surprisingly difficult to come up with one as I've found, although I am pretty sure it needs to be converted to a DFA first and then reduce each of them to a minimal DFA..

Also, I know that if L1 - L2 and L2 - L1 are empty, then L1 = L2.

Anyone good with theory here?

like image 344
John Avatar asked Oct 14 '10 07:10

John


People also ask

How do you show that two regular expressions are equivalent?

We say that two regular expressions R and S are equivalent if they describe the same language. In other words, if L(R) = L(S) for two regular expressions R and S then R = S. Examples.

What does 2 mean in regular expression?

2. It thus means that the character in the second group is repeated. So this matches string where a character is repeated two times (or more). – Willem Van Onsem.

How do you compare regular expressions?

The regular expression comparison is performed on the string representation of the left side of the comparison. That is, if the left side is an integer, the regular expression will behave is if the value 0 was the literal string "0" .

What can be matched using (*) in a regular expression?

A regular expression followed by an asterisk ( * ) matches zero or more occurrences of the regular expression. If there is any choice, the first matching string in a line is used.


2 Answers

You can find a description of a reasonably efficient algorithm for testing r.e. equality here:

http://arxiv.org/PS_cache/arxiv/pdf/0907/0907.5058v1.pdf

Dig through references of the article to find other solutions that may be less efficient, but easier to implement.

like image 86
Gintautas Miliauskas Avatar answered Oct 05 '22 19:10

Gintautas Miliauskas


Here's a conceptually simple answer (assuming L1 and L2 are regular).

1) Find DFAs D1 and D2 for L1 and L2 respectively.

2) Construct D'1 and D'2 from D1 and D2 by swapping accepting/non-accepting states (note that D'1 accepts exactly ~L1 and D'2 accepts ~L2 where ~ means "complement of")

3) Use the standard product construction three times to produce a DFA that accepts exactly (L1 intersect ~L2) union (L2 intersect ~L1)

4) Check to see if the DFA from part 3 accepts any strings by checking each accepting state for reachability from the start state.

5) If the DFA from part 3 accepts any strings, then L1 <> L2. Otherwise, L1=L2

There are a huge number of heuristics you could use to speed this up, but conceptually, this is probably the simplest algorithm. A good reference for the product construction in part 3 is "Automata and Computability" by Dexter Kozen.

like image 38
Aubrey da Cunha Avatar answered Oct 05 '22 20:10

Aubrey da Cunha