
When is it wise to use regular expressions with HTML? [closed]

While it's absolutely true that regexps are not the right tool for fully parsing HTML documents, I see a lot of people blindly dismissing any question about regexps if they so much as see a single HTML tag in the sample text.

Since we see plenty of examples where a regexp is not the right tool, I'd like your opinion on the opposite case: when is a simple pattern match a better solution than a full parsing engine?

Matteo Riva, asked Nov 29 '09 18:11



1 Answer

It's reasonable when the set of HTML you're looking to process with a regexp is known to conform to some sort of pattern, e.g. if you know there's no commented-out HTML and no complex scenarios such as nested or irregular markup.

For example, I often preach that you shouldn't use regexps for HTML, but if I have a set of HTML that I'm familiar with, that is straightforward, and that I can easily check after manipulation, then I have no qualms about using a regexp for it.
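As a rough illustration of that "known pattern, easy to check afterwards" situation, here is a minimal Python sketch. The sample markup, the `HREF_RE` pattern, and the helper names are all made up for the example; it assumes every link is written exactly as `<a href="...">` with double quotes and no other attributes. The standard library's `html.parser` is used only as the post-manipulation check.

```python
import re
from html.parser import HTMLParser

# Hypothetical, machine-generated markup: every link has the same shape,
# there are no comments, and nothing is nested inside the <a> tags.
KNOWN_HTML = """\
<ul>
<li><a href="/docs/intro.html">Intro</a></li>
<li><a href="/docs/setup.html">Setup</a></li>
</ul>
"""

# Only valid under the assumption above: <a href="..."> and nothing else.
HREF_RE = re.compile(r'<a href="([^"]*)">')


def hrefs_by_regex(html):
    """Extract href values with a plain pattern match."""
    return HREF_RE.findall(html)


class _HrefCollector(HTMLParser):
    """Collects href attributes of <a> tags using a real tokenizer."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


def hrefs_by_parser(html):
    """Extract href values with html.parser, used here as a sanity check."""
    collector = _HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


regex_result = hrefs_by_regex(KNOWN_HTML)
parser_result = hrefs_by_parser(KNOWN_HTML)

# The case the answer warns about: commented-out HTML breaks the assumption.
COMMENTED = '<!-- <a href="/old.html">Old</a> -->'
regex_on_comment = hrefs_by_regex(COMMENTED)    # the regex still matches
parser_on_comment = hrefs_by_parser(COMMENTED)  # the parser skips comments
```

On the well-behaved input both approaches agree, which is the cheap check mentioned above. On the commented-out input they disagree, which is exactly why the regexp is only safe when you know such cases cannot occur.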

Brian Agnew, answered Oct 24 '22 06:10