I guess my question is best explained with a (simplified) example.
Regex 1:
^\d+_[a-z]+$
Regex 2:
^\d*$
Regex 1 will never match a string where regex 2 matches. So let's say that regex 1 is orthogonal to regex 2.
As many people asked what I meant by orthogonal, I'll try to clarify:
Let S1 be the (infinite) set of strings that regex 1 matches, and let S2 be the set of strings that regex 2 matches. Regex 2 is orthogonal to regex 1 iff the intersection of S1 and S2 is empty. The regex ^\d_a$, by contrast, would not be orthogonal to regex 1, as the string '2_a' is in S1 and is also matched by ^\d_a$.
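To make the definition concrete: a single string matched by both patterns is enough to prove that two regexes are not orthogonal. The reverse does not hold, since no finite number of failed samples proves orthogonality. Below is a minimal sketch using java.util.regex; the class name WitnessCheck and the helper matchedByBoth are made up for illustration.

```java
import java.util.regex.Pattern;

class WitnessCheck {
    /** True if the whole of s is matched by both patterns, i.e. s lies in both sets. */
    static boolean matchedByBoth(Pattern p1, Pattern p2, String s) {
        return p1.matcher(s).matches() && p2.matcher(s).matches();
    }

    public static void main(String[] args) {
        Pattern regex1 = Pattern.compile("^\\d+_[a-z]+$");
        Pattern regex2 = Pattern.compile("^\\d*$");
        Pattern regex3 = Pattern.compile("^\\d_a$");

        // '2_a' is a witness that regex 1 and ^\d_a$ are not orthogonal.
        System.out.println(matchedByBoth(regex1, regex3, "2_a")); // prints true
        // The same string is no witness for regex 1 and regex 2.
        System.out.println(matchedByBoth(regex1, regex2, "2_a")); // prints false
    }
}
```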
How can it be determined programmatically whether two regexes are orthogonal to each other?
Best case would be some library that implements a method like:
/**
* @return True if the regex is orthogonal (i.e. "intersection is empty"), False otherwise or Null if it can't be determined
*/
public Boolean isRegexOrthogonal(Pattern regex1, Pattern regex2);
We say that two regular expressions R and S are equivalent if they describe the same language. In other words, if L(R) = L(S), then R and S are equivalent, which is often written R = S.
For example, the regular expression \s matches a single whitespace character, while \s+ matches one or more whitespace characters.
PCRE tries to match Perl syntax and semantics as closely as it can. PCRE also supports some alternative regular expression syntax (which does not conflict with the Perl syntax) in order to provide some compatibility with regular expressions in Python, .NET, and Oniguruma.
By "Orthogonal" you mean "the intersection is the empty set" I take it?
I would construct the regular expression for the intersection, then convert to a regular grammar in normal form, and see if it's the empty language...
Then again, I'm a theorist...
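One common way to carry out the "intersection, then emptiness" idea is with automata rather than a regular grammar: run both automata in lockstep (the product construction) and check whether any reachable state pair is accepting in both. The sketch below is only an illustration under strong assumptions. The DFAs for the example regexes are written by hand, the names RegexOrthogonality, Dfa, REGEX_1 to REGEX_3 and isOrthogonal are invented here, and the alphabet is reduced to one representative character per class that these particular regexes distinguish. Turning an arbitrary java.util.regex.Pattern into an automaton is not shown, and features such as backreferences leave the regular languages entirely, which is where the "Null if it can't be determined" case would come from.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class RegexOrthogonality {
    /** A deterministic automaton; step returns -1 when no transition exists. */
    interface Dfa {
        int start();
        boolean accepts(int state);
        int step(int state, char c);
    }

    // Assumption: one representative per character class that the example
    // DFAs below distinguish: digit, '_', 'a', other lowercase, anything else.
    static final char[] REPRESENTATIVES = {'0', '_', 'a', 'b', '#'};

    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }
    static boolean isLower(char c) { return c >= 'a' && c <= 'z'; }

    /** Hand-built DFA for ^\d+_[a-z]+$ (state 3 is accepting). */
    static final Dfa REGEX_1 = new Dfa() {
        public int start() { return 0; }
        public boolean accepts(int s) { return s == 3; }
        public int step(int s, char c) {
            if ((s == 0 || s == 1) && isDigit(c)) return 1;
            if (s == 1 && c == '_') return 2;
            if ((s == 2 || s == 3) && isLower(c)) return 3;
            return -1;
        }
    };

    /** Hand-built DFA for ^\d*$ (the single state 0 is accepting). */
    static final Dfa REGEX_2 = new Dfa() {
        public int start() { return 0; }
        public boolean accepts(int s) { return s == 0; }
        public int step(int s, char c) { return isDigit(c) ? 0 : -1; }
    };

    /** Hand-built DFA for ^\d_a$ (state 3 is accepting). */
    static final Dfa REGEX_3 = new Dfa() {
        public int start() { return 0; }
        public boolean accepts(int s) { return s == 3; }
        public int step(int s, char c) {
            if (s == 0 && isDigit(c)) return 1;
            if (s == 1 && c == '_') return 2;
            if (s == 2 && c == 'a') return 3;
            return -1;
        }
    };

    /** Packs a state pair into one long so it can be stored in a set. */
    static long key(int x, int y) { return ((long) x << 32) | (y & 0xffffffffL); }

    /**
     * Breadth-first search over the product automaton. Returns false as soon
     * as a reachable pair is accepting in both DFAs (the intersection is
     * non-empty), and true if no such pair is reachable.
     */
    static boolean isOrthogonal(Dfa a, Dfa b) {
        Deque<int[]> queue = new ArrayDeque<>();
        Set<Long> seen = new HashSet<>();
        queue.add(new int[] {a.start(), b.start()});
        seen.add(key(a.start(), b.start()));
        while (!queue.isEmpty()) {
            int[] pair = queue.poll();
            if (a.accepts(pair[0]) && b.accepts(pair[1])) return false;
            for (char c : REPRESENTATIVES) {
                int na = a.step(pair[0], c);
                int nb = b.step(pair[1], c);
                if (na < 0 || nb < 0) continue; // one side rejects: dead end
                if (seen.add(key(na, nb))) queue.add(new int[] {na, nb});
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isOrthogonal(REGEX_1, REGEX_2)); // prints true
        System.out.println(isOrthogonal(REGEX_1, REGEX_3)); // prints false
    }
}
```

The search always terminates because there are only finitely many state pairs. For real patterns, the hand-built DFAs would be replaced by automata generated from the regex source, which an automaton library can do for the purely regular subset of the syntax.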