
Are regex tools (like RegexBuddy) a good idea?

Tags: regex

One of my developers has started using RegexBuddy for help in interpreting legacy code, which is a usage I fully understand and support. What concerns me is using a regex tool for writing new code. I have actually discouraged its use for new code in my team. Two quotes come to mind:

Some people, when confronted with a problem, think "I know, I’ll use regular expressions." Now they have two problems. - Jamie Zawinski

And:

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. - Brian Kernighan

My concerns are, respectively:

  • That the tool may make it possible to solve a problem using a complicated regular expression that really doesn't need it. (See also this question).

  • That my one developer, using regex tools, will start writing regular expressions which (even with comments) can't be maintained by anyone who doesn't have (and know how to use) regex tools.

Should I encourage or discourage the use of regex tools, specifically with regard to producing new code? Are my concerns justified? Or am I being paranoid?

Adam Bellaire, asked Oct 23 '08 16:10



2 Answers

Poor programming is rarely the fault of the tool. It is the fault of the developer not understanding the tool. To me, this is like saying a carpenter should not own a screwdriver because he might use a screw where a nail would have been more appropriate.

EBGreen, answered Oct 09 '22 04:10


Regular expressions are just one of the many tools available to you. I don't generally agree with the oft-cited Zawinski quote: as with any technology or technique, there are both good and bad ways to apply regular expressions.

Personally, I see things like RegexBuddy and the free Regex Coach primarily as learning tools. There are certainly times when they can be helpful to debug or understand existing regexes, but generally speaking, if you've written your regex using a tool, then it's going to be very hard to maintain it.

As a Perl programmer, I'm very familiar with both good and bad regular expressions, and have been using even complicated ones in production code successfully for many years. Here are a few of the guidelines I like to stick to that have been gathered from various places:

  • Don't use a regex when a string match will do. I often see code where people use regular expressions in order to match a string case-insensitively. Simply lower- or upper-case the string and perform a standard string comparison.
  • Don't use a regex to see if a string is one of several possible values. This is unnecessarily hard to maintain. Instead, place the possible values in an array or hash (whatever your language provides) and test the string against those.
  • Write tests! Having a set of tests that specifically target your regular expression makes development significantly easier, particularly if it's a vaguely complicated one. Plus, a few tests can often answer many of the questions a maintenance programmer is likely to have about your regex.
  • Construct your regex out of smaller parts. If you really need a big complicated regex, build it out of smaller, testable sections. This not only makes development easier (as you can get each smaller section right individually), but it also makes the code more readable, flexible and allows for thorough commenting.
  • Build your regular expression into a dedicated subroutine/function/method. This makes it very easy to write tests for the regex (and only the regex). It also makes the code in which your regex is used easier to read (a nicely named function call is considerably less scary than a block of random punctuation!). Dropping huge regular expressions into the middle of a block of code (where they can't easily be tested in isolation) is extremely common, and usually very easy to avoid.
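
As a rough illustration of the guidelines above, here is a minimal sketch in Python (the answer itself is written from a Perl perspective and gives no code). Every name in it (same_word, ALLOWED_STATES, parse_iso_date and so on) is hypothetical and chosen only for the example, and the date pattern is an assumed use case, not something from the original answer.

```python
import re

# Guideline 1: a case-insensitive equality check needs no regex.
# str.casefold() is Python's normalisation for caseless comparison.
def same_word(a: str, b: str) -> bool:
    return a.casefold() == b.casefold()

# Guideline 2: "is it one of these values?" is a set lookup, not a regex.
ALLOWED_STATES = {"open", "closed", "pending"}  # hypothetical values

def is_allowed_state(value: str) -> bool:
    return value.casefold() in ALLOWED_STATES

# Guideline 4: build the pattern out of small, named, commentable parts.
YEAR = r"(?P<year>\d{4})"
MONTH = r"(?P<month>0[1-9]|1[0-2])"       # 01 to 12
DAY = r"(?P<day>0[1-9]|[12]\d|3[01])"     # 01 to 31
ISO_DATE = re.compile(rf"{YEAR}-{MONTH}-{DAY}")

# Guideline 5: hide the regex behind a well-named function.
def parse_iso_date(text: str):
    """Return (year, month, day) as ints, or None if text is not
    shaped like YYYY-MM-DD. This checks the shape only; it does not
    verify that the day exists in that month."""
    match = ISO_DATE.fullmatch(text)
    if match is None:
        return None
    return int(match["year"]), int(match["month"]), int(match["day"])

# Guideline 3: tests that target the regex and nothing else.
def test_parse_iso_date():
    assert parse_iso_date("2008-10-23") == (2008, 10, 23)
    assert parse_iso_date("2008-13-01") is None   # month out of range
    assert parse_iso_date("2008-10-23 ") is None  # trailing space
```

Because the pattern lives in one function, a maintainer can read the three test lines to learn what it accepts and rejects without decoding the expression, and without needing a regex tool.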

Dan, answered Oct 09 '22 02:10