
When should I not use regular expressions?

Tags:

regex

After some research I figured out that it is not possible to parse recursive structures (such as HTML or XML) using regular expressions. Is it possible to comprehensively list the day-to-day coding scenarios where I should avoid regular expressions because the task is simply impossible to do with them? Let us say the regex engine in question is not PCRE.

Narendra Yadala asked Sep 26 '11 10:09


People also ask

What can regular expressions not do?

In short, regular expressions do not allow a pattern to refer to itself. You cannot say: at this point in the syntax, match the whole pattern again. To put it another way, a regular expression matches linearly; it has no stack that would let it keep track of how deep it is in a nested pattern.
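To make the "no stack" point concrete, here is a minimal sketch of the kind of bookkeeping a nested structure needs. The function name `is_balanced` is made up for this illustration; it only checks parentheses, and a single counter stands in for the stack because there is just one bracket type.

```python
def is_balanced(text: str) -> bool:
    """Return True if every '(' in text is closed by a later ')'."""
    depth = 0  # current nesting depth; the state a classic regex cannot keep
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' showed up before its matching '('
                return False
    return depth == 0  # anything left open means the input is unbalanced
```

A regular expression in the strict sense has no counterpart to `depth`, which is why it cannot accept arbitrarily deep nesting.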

Is it good to use regular expressions?

Regular expressions are useful in search and replace operations. The typical use case is to look for a sub-string that matches a pattern and replace it with something else. Most APIs using regular expressions allow you to reference capture groups from the search pattern in the replacement string.
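As a rough illustration of referencing capture groups in a replacement string, the sketch below uses Python's `re.sub`. The helper name `swap_date` and the date format are assumptions chosen for the example, not something from the original text.

```python
import re

def swap_date(text: str) -> str:
    """Rewrite dates written as YYYY-MM-DD into DD/MM/YYYY."""
    # Groups 1, 2 and 3 capture year, month and day; the replacement
    # string refers back to them as \1, \2 and \3.
    return re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)
```

For instance, `swap_date("asked 2011-09-26")` gives `"asked 26/09/2011"`.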


1 Answer

Don't use regular expressions when:

  • the language you are trying to parse is not a regular language, or
  • there are readily available parsers specifically made for the data you are trying to parse.

Parsing HTML and XML with regular expressions is usually a bad idea, both because they are not regular languages and because libraries already exist that can parse them for you.
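For example, here is a small sketch that pulls link targets out of HTML with the parser in Python's standard library instead of a regex. The names `LinkCollector` and `extract_links` are invented for this illustration, and a real project might well prefer a more forgiving third-party parser.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag, however deeply nested."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

def extract_links(html: str) -> list:
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

The parser deals with nesting, attribute order and quoting, so the calling code never has to anticipate them in a pattern.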

As another example, if you need to check if an integer is in the range 0-255, it's easier to understand if you use your language's library functions to parse it to an integer and then check its numeric value instead of trying to write the regular expression that matches this range.
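A quick sketch of the two approaches side by side, in Python. Both function names are made up for this comparison, and the regex is one common way of writing the 0-255 range, not the only one.

```python
import re

def in_byte_range(text: str) -> bool:
    """Parse the text as an integer, then compare numerically."""
    try:
        value = int(text)
    except ValueError:
        return False  # not an integer at all
    return 0 <= value <= 255

# The regex version has to enumerate the digit shapes by hand:
# 250-255, 200-249, 100-199, and 0-99 without a leading zero.
BYTE_RE = re.compile(r"25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9]")

def in_byte_range_regex(text: str) -> bool:
    return BYTE_RE.fullmatch(text) is not None
```

The two are not strictly equivalent: `int()` also accepts inputs such as `" 12 "` or `"007"`, which this regex rejects. Either way, the intent of the first version is obvious at a glance, while the second needs a comment to be readable.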

Mark Byers answered Sep 24 '22 10:09