I have a large collection of regular expressions that, when matched, call a particular HTTP handler. Some of the older regexes are unreachable (e.g. a.c* ⊃ abc*) and I'd like to prune them. Is there a library that, given two regexes, will tell me whether the second is a subset of the first? I wasn't sure this was decidable at first (it smelled like the halting problem by a different name), but it turns out that it is.

Determining whether a regex is a subset of another

1 Answer

Trying to find the complexity of this problem led me to this paper.

The formal definition of the problem can be found within it; this is generally called the inclusion problem:

The inclusion problem for R, is to test for two given expressions r, r′ ∈ R, whether r ⊆ r′.

That paper has some great information (summary: for all but the simplest classes of expressions, the problem is fairly complex). However, searching for information on the inclusion problem leads directly back to Stack Overflow, to an answer that already links to a paper describing a passable polynomial-time algorithm, which should cover a lot of common cases.
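To make the inclusion test concrete, here is a minimal sketch of one way to decide r ⊆ r′. It is not the polynomial-time algorithm from the linked paper: it swaps in Brzozowski derivatives and explores pairs of derivatives, which can take exponential time in the worst case. It assumes whole-string matching and a tiny regex dialect (literals, '.', '*', '|', parentheses, no escapes). The names is_subset, parse, deriv and nullable are made up for this example and do not come from any library.

```python
# Illustrative sketch only: regex inclusion via Brzozowski derivatives.
EMPTY, EPS, ANY = ("empty",), ("eps",), ("any",)  # no string, "", any one char

def alt(*items):
    # Union, flattened and deduplicated so the set of derivatives stays finite.
    parts = set()
    for r in items:
        if r[0] == "alt":
            parts |= r[1]
        elif r != EMPTY:
            parts.add(r)
    if not parts:
        return EMPTY
    return next(iter(parts)) if len(parts) == 1 else ("alt", frozenset(parts))

def cat(a, b):
    if a == EMPTY or b == EMPTY:
        return EMPTY
    if a == EPS:
        return b
    return a if b == EPS else ("cat", a, b)

def star(a):
    if a in (EMPTY, EPS):
        return EPS
    return a if a[0] == "star" else ("star", a)

def parse(pattern):
    # Recursive descent: alt := cat ('|' cat)*, cat := rep*, rep := atom '*'*
    pos = 0
    def peek():
        return pattern[pos] if pos < len(pattern) else None
    def parse_alt():
        nonlocal pos
        r = parse_cat()
        while peek() == "|":
            pos += 1
            r = alt(r, parse_cat())
        return r
    def parse_cat():
        r = EPS
        while peek() not in (None, "|", ")"):
            r = cat(r, parse_rep())
        return r
    def parse_rep():
        nonlocal pos
        c = peek()
        pos += 1
        if c == "*":
            raise ValueError("nothing to repeat")
        if c == "(":
            r = parse_alt()
            if peek() != ")":
                raise ValueError("missing ')'")
            pos += 1
        else:
            r = ANY if c == "." else ("chr", c)
        while peek() == "*":
            pos += 1
            r = star(r)
        return r
    result = parse_alt()
    if pos != len(pattern):
        raise ValueError("unbalanced ')'")
    return result

def nullable(r):
    # True if r matches the empty string.
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return tag == "alt" and any(nullable(x) for x in r[1])

def deriv(r, c):
    # Regex for "what may follow" after r has consumed the symbol c.
    tag = r[0]
    if tag == "any":
        return EPS
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "alt":
        return alt(*(deriv(x, c) for x in r[1]))
    if tag == "star":
        return cat(deriv(r[1], c), r)
    if tag == "cat":
        d = cat(deriv(r[1], c), r[2])
        return alt(d, deriv(r[2], c)) if nullable(r[1]) else d
    return EMPTY  # eps and empty have no continuation

def is_subset(sub, sup):
    # True if every string fully matched by `sub` is fully matched by `sup`.
    a, b = parse(sub), parse(sup)
    # None stands for "any character not written in either pattern".
    alphabet = {ch for ch in sub + sup if ch not in ".*|()"} | {None}
    seen, todo = {(a, b)}, [(a, b)]
    while todo:
        x, y = todo.pop()
        if nullable(x) and not nullable(y):
            return False  # some string is accepted by sub but not by sup
        for c in alphabet:
            nxt = (deriv(x, c), deriv(y, c))
            if nxt[0] != EMPTY and nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True
```

With the example from the question, is_subset("abc*", "a.c*") holds while is_subset("a.c*", "abc*") does not, so under these assumptions a handler registered for abc* after a.c* would be flagged as unreachable. Real-world regex features (anchors, character classes, backreferences, lazy quantifiers) are outside what this sketch handles.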

196

answered Oct 02 '22 18:10

Kevin Stricker

Tags:

regex

regular-language

halting-problem

deft_code
