Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you debug a regex? [closed]

People also ask

When should you not use regex?

Regex isn't suited to parse HTML because HTML isn't a regular language. Regex probably won't be the tool to reach for when parsing source code. There are better tools to create tokenized outputs. I would avoid parsing a URL's path and query parameters with regex.

What are regex commands?

A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.

How do you evaluate in regex?

To test a regular expression, first search for errors such as non-escaped characters or unbalanced parentheses. Then test it against various input strings to ensure it accepts correct strings and regex wrong ones. A regex tester tool is a great tool that does all of this.

Does regex affect performance?

Being more specific with your regular expressions, even if they become much longer, can make a world of difference in performance. The fewer characters you scan to determine the match, the faster your regexes will be.


You buy RegexBuddy and use its built in debug feature. If you work with regexes more than twice a year, you will make this money back in time saved in no time. RegexBuddy will also help you to create simple and complex regular expressions, and even generate the code for you in a variety of languages.

alt text

Also, according to the developer, this tool runs nearly flawlessly on Linux when used with WINE.


With Perl 5.10, use re 'debug';. (Or debugcolor, but I can't format the output properly on Stack Overflow.)

$ perl -Mre=debug -e'"foobar"=~/(.)\1/'
Compiling REx "(.)\1"
Final program:
   1: OPEN1 (3)
   3:   REG_ANY (4)
   4: CLOSE1 (6)
   6: REF1 (8)
   8: END (0)
minlen 1
Matching REx "(.)\1" against "foobar"
   0 <> <foobar>             |  1:OPEN1(3)
   0 <> <foobar>             |  3:REG_ANY(4)
   1 <f> <oobar>             |  4:CLOSE1(6)
   1 <f> <oobar>             |  6:REF1(8)
                                  failed...
   1 <f> <oobar>             |  1:OPEN1(3)
   1 <f> <oobar>             |  3:REG_ANY(4)
   2 <fo> <obar>             |  4:CLOSE1(6)
   2 <fo> <obar>             |  6:REF1(8)
   3 <foo> <bar>             |  8:END(0)
Match successful!
Freeing REx: "(.)\1"

Also, you can add whitespace and comments to regexes to make them more readable. In Perl, this is done with the /x modifier. With pcre, there is the PCRE_EXTENDED flag.

"foobar" =~ /
    (.)  # any character, followed by a
    \1   # repeat of previously matched character
/x;

pcre *pat = pcre_compile("(.)  # any character, followed by a\n"
                         "\\1  # repeat of previously matched character\n",
                         PCRE_EXTENDED,
                         ...);
pcre_exec(pat, NULL, "foobar", ...);

I'll add another so that I don't forget it : debuggex

It's good because it's very visual: Photo of the Debuggex regex helper


When I get stuck on a regex I usually turn to this: https://regexr.com/

Its perfect for quickly testing where something is going wrong.