Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Testing intersection of two regular languages

I want to test whether two languages have a string in common. Both of these languages are from a subset of regular languages described below and I only need to know whether there exists a string in both languages, not produce an example string.

The language is specified by a glob-like string like

/foo/**/bar/*.baz

where ** matches 0 or more characters, and * matches zero or more characters that are not /, and all other characters are literal.

Any ideas?

thanks, mike

EDIT:

I implemented something that seems to perform well, but have yet to try a correctness proof. You can see the source and unit tests

like image 876
Mike Samuel Avatar asked Feb 26 '10 00:02

Mike Samuel


People also ask

Is intersection of 2 regular languages regular?

No, the intersection of two regular languages is guaranteed to be a regular language. This can be proved a lot of ways, but an easy way is to use closure properties.

What is the intersection of two regular languages?

Intersection. Theorem If L1 and L2 are regular languages, then the new language L = L1 ∩ L2 is regular.

How do you find the intersection of two regular expressions?

The intersection of two regular set is regular. RE (L1 ∩ L2) = aa(aa)* which is a regular expression itself.

Is the intersection of two regular languages context-free?

1.5 Intersection with a regular languageThe intersection of a context-free language and a regular language is context- free (Theorem 3.5. 2). The idea of the proof is to simulate a push-down automaton and a finite state automaton in parallel and only accept if both machines accept.


2 Answers

Build FAs A and B for both languages, and construct the "intersection FA" AnB. If AnB has at least one accepting state accessible from the start state, then there is a word that is in both languages.

Constructing AnB could be tricky, but I'm sure there are FA textbooks that cover it. The approach I would take is:

  • The states of AnB is the cartesian product of the states of A and B respectively. A state in AnB is written (a, b) where a is a state in A and b is a state in B.
  • A transition (a, b) ->r (c, d) (meaning, there is a transition from (a, b) to (c, d) on symbol r) exists iff a ->r c is a transition in A, and b ->r d is a transition in B.
  • (a, b) is a start state in AnB iff a and b are start states in A and B respectively.
  • (a, b) is an accepting state in AnB iff each is an accepting state in its respective FA.

This is all off the top of my head, and hence completely unproven!

like image 67
Edmund Avatar answered Oct 11 '22 22:10

Edmund


I just did a quick search and this problem is decidable (aka can be done), but I don't know of any good algorithms to do it. One is solution is:

  1. Convert both regular expressions to NFAs A and B
  2. Create a NFA, C, that represents the intersection of A and B.
  3. Now try every string from 0 to the number of states in C and see if C accepts it (since if the string is longer it must repeat states at one point).

I know this might be a little hard to follow but this is only way I know how.

like image 38
Bishnu Avatar answered Oct 11 '22 20:10

Bishnu