Given two regular expressions, is it possible to detect whether there is any possible string that matches them both?
For example, given regexes A
and .
, I can see that string "A"
matches them both. That's a simple case.
My question is for the broader case -- given any two valid regexes, would it be possible to definitively say whether there is any possible string that would match both regexes? Assume that there is no sample set of input strings to test. All I have are the regexes. I don't necessarily need to produce matching strings -- I just need to determine that there are possible strings that match both.
Will accept discussions for any of the common regex specifications -- .NET, Java, PERL, sed, grep, etc.
We say that two regular expressions R and S are equivalent if they describe the same language. In other words, if L(R) = L(S) for two regular expressions R and S then R = S. Examples.
The backslash character (\) in a regular expression indicates that the character that follows it either is a special character (as shown in the following table), or should be interpreted literally. For more information, see Character Escapes. Escaped character. Description. Pattern.
On the other hand, the \S+ (uppercase S ) matches anything that is NOT matched by \s , i.e., non-whitespace. In regex, the uppercase metacharacter denotes the inverse of the lowercase counterpart, for example, \w for word character and \W for non-word character; \d for digit and \D or non-digit.
Multiline option, it matches either the newline character ( \n ) or the end of the input string. It does not, however, match the carriage return/line feed character combination.
Basically, you want to test if the intersection of two RegExps is non-empty. Since intersection - just like complement - is a potentially expensive operation (it requires determinization of the NFA), it is not implemented in many RegExp implementations. One exception I know of is the BRICS Automaton Library, which allows enabling the intersection operator &
.
To test the property in question, you could use the BRICS (Java) library like this:
RegExp re = new RegExp("(.) & (a)", RegExp.INTERSECTION); // Parse RegExp
Automaton a = re.toAutomaton(); // convert RegExp to automaton
if(a.isEmpty()) { // Test if intersection is empty
System.out.println("Intersection is empty!");
}
else {
// Print the shortest accepted string
System.out.println("Intersection is non-empty, example: " + a.getShortestExample(true));
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With