Some say I should use regexes whenever possible; others say I should use them as little as possible. Is there something like a "Perl Etiquette" on this matter, or is it just TIMTOWTDI?
The level of complexity generally dictates whether I use a regex or not. Some of the questions I ask when deciding whether or not to use a regex are:
- Is there no built-in string function that handles this relatively easily?
- Do I need to capture substring groups?
- Do I need complex features like lookbehind or negated character classes?
- Am I going to make use of character classes?
- Will using a regex make my code more readable?
If I answer yes to any of these, I generally use a regex.
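To make the first two questions concrete, here is a minimal sketch. The log line and variable names are invented for illustration and are not from any real program. A plain substring test is easy with a built-in, while pulling several pieces out of a string is where capture groups pay off.

```perl
use strict;
use warnings;

# Hypothetical sample data, invented for this example.
my $line = 'ERROR 2024-01-15 disk full on /dev/sda1';

# A built-in handles a fixed-substring test; no regex needed.
my $has_error = index($line, 'ERROR') == 0 ? 1 : 0;

# Capturing several substrings at once is where a regex earns its keep.
my ($year, $month, $day) = $line =~ /(\d{4})-(\d{2})-(\d{2})/;

print "starts with ERROR: $has_error\n";
print "date parts: $year $month $day\n";
```

Doing the date extraction with index and substr alone would take several lines of offset arithmetic, which is roughly the point at which the readability question tips toward a regex.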
I think a lot of the answers you got already are good. I want to address the etiquette part because I think there is some.
Summed up: if there is a robust parser available, use it instead of regular expressions, 100% of the time. Never recommend anything else to a novice. So:
Don'ts
- Don't split or match against commas for CSV; use Text::CSV/Text::CSV_XS.
- Don't write regexes against HTML or XML; use XML::LibXML, XML::Twig, HTML::TreeBuilder, HTML::TokeParser::Simple, et cetera.
- Don't write regexes for things that are trivial to handle with split or unpack.
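A small sketch of why the CSV rule exists, assuming a made-up record with a quoted comma in it. Text::CSV is a CPAN module rather than part of core Perl, so the example only uses it if it can be loaded.

```perl
use strict;
use warnings;

# Hypothetical record, invented for illustration:
# the second field contains a comma inside quotes.
my $row = 'widget,"Smith, John",42';

# Naive approach: splitting on commas tears the quoted field apart.
my @naive = split /,/, $row;
print scalar(@naive), " fields from split\n";    # 4, not the intended 3

# Text::CSV comes from CPAN, so load it only if it is installed.
my @parsed;
if (eval { require Text::CSV; 1 }) {
    my $csv = Text::CSV->new({ binary => 1 })
        or die "Cannot create Text::CSV object";
    $csv->parse($row) or die "Could not parse row";
    @parsed = $csv->fields;
    print scalar(@parsed), " fields from Text::CSV\n";
    print "name: $parsed[1]\n";
}
else {
    print "Text::CSV not installed; skipping the parser half\n";
}
```

The split version could be patched with a cleverer pattern, but escaped quotes and embedded newlines would break it again, which is the argument for reaching for the parser first.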
Dos
- Do use substr, index, and rindex where appropriate, but recognize that they can come off as "unperly", so they are best used when benchmarking shows them to be superior to regular expressions; regexes can be surprisingly fast in many cases.
- Do use regular expressions when there is no good parser available and writing a Parse::RecDescent grammar is overkill, too much work, or will be too slow.
- Do use regular expressions for throw-away code like one-liners on well-known/predictable data including the HTML/CSV previously banned from regular expression use.
- Do be aware of alternatives for bigger problems, like Parse::RecDescent, Parse::Yapp, and Marpa.
- Do keep your own counsel. Perl is supposed to be fun. Do whatever you like; just be prepared to get bashed if you complain when not following advice and it goes sideways. :P
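For the built-ins mentioned in the lists above (substr, index, rindex, split, unpack), here is a rough side-by-side sketch. The path and the fixed-width record are invented for illustration, and the regex shown is only one of several reasonable ways to write the equivalent match.

```perl
use strict;
use warnings;

# Hypothetical path, invented for illustration.
my $path = '/var/log/app/server.2024.log';

# rindex + substr: take everything after the last dot.
my $dot        = rindex($path, '.');
my $ext_substr = $dot >= 0 ? substr($path, $dot + 1) : '';

# Regex equivalent: capture the run of non-dot characters at the end.
my ($ext_regex) = $path =~ /\.([^.]+)\z/;

# split handles simple delimited data; the last piece is the file name.
my @parts = split m{/}, $path;
my $file  = $parts[-1];

# unpack handles fixed-width records ("A" strips trailing spaces).
# Hypothetical layout: 3-character code followed by a 10-character name.
my ($code, $name) = unpack 'A3 A10', 'ABCJohn      ';

print "extension via substr: $ext_substr\n";
print "extension via regex:  $ext_regex\n";
print "file: $file, code: $code, name: $name\n";
```

Which version is faster depends on the data and the Perl build, so if speed is the deciding factor, the core Benchmark module can compare the two on real input rather than guessing.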