Worst Case Analysis for Regular Expressions

Tags: python, regex, optimization, perl, analysis

Are there any tools that take a particular regular expression and report the worst-case number of operations required to match it against an input of a given length?

So for example, given (f|a)oo.*[ ]baz, how many steps might the engine possibly go through to match 100 characters?

I would also be interested if there is a tool that can take a bunch of text samples and show the average operations for each run.

I realize this will depend a lot on the engine used and the implementation -- but I am ignorant as to how common this is. So if it is common for many languages (making my question too vague) I would be particularly interested in Perl and Python.

asked Jan 19 '11 02:01

Kyle Brandt

2 Answers

RegexBuddy's debugger shows how many steps the engine takes to decide whether a given sample matches. More information is available on catastrophic backtracking and on debugging regular expressions.

[Image: catastrophic backtracking shown in RegexBuddy (https://i.stack.imgur.com/5nGMC.png)]

PS: It is not free but they offer a 3-month money-back guarantee.
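For Python specifically, a rough substitute for a step counter is to time the match over a set of samples. The sketch below is only an approximation: as far as I know the standard `re` module does not expose a step count, so wall-clock time stands in for it. The helper names `time_samples` and `average_time` and the sample strings are made up for illustration.

```python
import re
import time


def time_samples(pattern, samples, repeats=5):
    """Return the best wall-clock time (seconds) of one search per sample."""
    compiled = re.compile(pattern)
    timings = []
    for sample in samples:
        best = float("inf")
        # Repeat and keep the minimum to reduce noise from the OS scheduler.
        for _ in range(repeats):
            start = time.perf_counter()
            compiled.search(sample)
            best = min(best, time.perf_counter() - start)
        timings.append(best)
    return timings


def average_time(timings):
    """Mean of the per-sample timings, in seconds."""
    return sum(timings) / len(timings)


if __name__ == "__main__":
    pattern = r"(f|a)oo.*[ ]baz"
    # Illustrative samples: one match, two failures of different shapes.
    samples = ["foo bar baz", "aoo" + "x" * 100, "foo" + " ba" * 30]
    timings = time_samples(pattern, samples)
    for sample, seconds in zip(samples, timings):
        print(f"{len(sample):4d} chars  {seconds * 1e6:8.2f} us")
    print(f"average: {average_time(timings) * 1e6:.2f} us")
```

Timings vary by machine and Python version, so they are mainly useful for comparing samples against each other or for watching how the cost grows as the input gets longer.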

answered Sep 28 '22 04:09

Himanshu

Note that it depends on the engine. Although regex theory rests on automata theory, most engines are not strict translations of that theory. For this reason, some engines can take exponential time on inputs where strict NFA processing would not.
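To make the difference concrete, here is a toy sketch that counts steps for one fixed pattern, (a|aa)*c, matched against the whole input. It is hand-coded for that single pattern and is not how Perl or Python is actually implemented. The function names are hypothetical. The first function tries alternatives one at a time and backtracks. The second keeps the set of all positions the loop could be at, which is the idea behind strict NFA processing.

```python
def backtracking_steps(text):
    """Count calls made by a naive backtracking matcher for (a|aa)*c."""
    steps = 0

    def match_from(pos):
        nonlocal steps
        steps += 1
        # Try to finish the match with the trailing 'c' at end of input.
        if text.startswith("c", pos) and pos + 1 == len(text):
            return True
        # Alternative 1: consume a single 'a', then go round the loop again.
        if text.startswith("a", pos) and match_from(pos + 1):
            return True
        # Alternative 2: consume 'aa', then go round the loop again.
        if text.startswith("aa", pos) and match_from(pos + 2):
            return True
        return False

    matched = match_from(0)
    return matched, steps


def state_set_steps(text):
    """Count steps for the same pattern when every live position is visited once."""
    steps = 0
    live = {0}  # positions where the (a|aa)* loop may currently stand
    matched = False
    # Transitions only move forward, so one left-to-right pass is enough.
    for pos in range(len(text) + 1):
        if pos not in live:
            continue
        steps += 1
        if text.startswith("c", pos) and pos + 1 == len(text):
            matched = True
        if text.startswith("a", pos):
            live.add(pos + 1)
        if text.startswith("aa", pos):
            live.add(pos + 2)
    return matched, steps


if __name__ == "__main__":
    # Inputs of only 'a' never match, which forces the backtracker
    # to try every way of splitting the run into 'a' and 'aa' pieces.
    for n in (4, 8, 16, 20):
        text = "a" * n
        print(n, backtracking_steps(text)[1], state_set_steps(text)[1])
```

On failing inputs made only of `a`, the backtracking count grows roughly like the Fibonacci numbers, while the state-set count grows by one per character.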

answered Sep 28 '22 04:09

Daniel C. Sobral
