
TCL string match vs regexps

Is it right that we should avoid using regexp because it is slow, and use string operations instead? Are there cases where both can be used but regexp is better?

Narek asked Sep 14 '11 05:09



1 Answer

You should use the appropriate tool for the job. That means you should not avoid regex; you should use it when it is necessary.

If you are just searching for a fixed sequence of characters, use string operations.

If you are searching for a pattern, then use regular expressions.
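As a rough illustration of that split, here is a minimal Tcl sketch. The sample text and the variable names are made up for this example; only `string first`, `string match` and `regexp` are standard Tcl commands.

```tcl
# Hypothetical sample text, for illustration only.
set text "error 42: disk /dev/sda1 is full"

# Fixed sequence of characters: plain string operations are enough.
# string first returns the index of the first occurrence, or -1 if absent.
set hasDisk [expr {[string first "disk" $text] >= 0}]

# Simple wildcard (glob) matching also needs no regex.
set isError [string match "error *" $text]

# A real pattern (here: "error" followed by a number) calls for regexp.
# The capture group stores the digits in the variable code.
set hasCode [regexp {error (\d+)} $text -> code]

puts "hasDisk=$hasDisk isError=$isError hasCode=$hasCode code=$code"
```

The first two checks only look for literal text or a glob wildcard, so string operations cover them. The third one needs "one or more digits", which is where a regex earns its place.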

Example

Search for the word "Foo". If you use string operations, the search will also find "Foobar". Is this OK? If not, you could search for "Foo " instead, but then it will not find "Foo," or "Foo."

With regex this is no problem: you can match on word boundaries with /\mFoo\M/, and this regex will not be slow.
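The "Foo" example can be tried out with a short Tcl sketch like the one below. The proc names `containsSubstring` and `containsWord` are hypothetical helpers invented for this illustration.

```tcl
# Hypothetical helper: plain substring search with a string operation.
proc containsSubstring {needle haystack} {
    return [expr {[string first $needle $haystack] >= 0}]
}

# Hypothetical helper: whole-word search with a regex.
# \m matches at the start of a word, \M at the end of a word.
proc containsWord {haystack} {
    return [regexp {\mFoo\M} $haystack]
}

foreach sample [list "Foobar" "Foo," "Foo." "a Foo b"] {
    puts [format "%-10s substring=%d word=%d" $sample \
        [containsSubstring "Foo" $sample] [containsWord $sample]]
}
```

The substring search reports a hit for all four samples, including "Foobar", while the word-boundary regex rejects "Foobar" and still accepts "Foo," and "Foo.".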

I think this negative image comes from special problems like catastrophic backtracking.

There has been a recent example (catastrophic-backtracking-shouldnt-be-happening-on-this-regex) where this behaviour was unexpected.

Conclusion

A regex has to be well designed; if it isn't, the performance can be catastrophic. But the same can happen to your normal code if you use a bad algorithm.

For a small job, using a regex should almost never be a problem. If your task is bigger and has to be repeated often, do a benchmark.
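One way to run such a benchmark is Tcl's built-in `time` command. The following is only a sketch: the sample line and the iteration count are arbitrary assumptions, and the numbers depend on your Tcl version, machine and input, so measure with your own data.

```tcl
# Hypothetical sample line, for illustration only.
set line "2011-09-14 05:09 INFO Foo, bar and Foobar"

# time runs the script the given number of times and returns a string
# of the form "<number> microseconds per iteration".
set iterations 10000
set tString [time {string first "Foo" $line} $iterations]
set tRegexp [time {regexp {\mFoo\M} $line} $iterations]

puts "string first: $tString"
puts "regexp:       $tRegexp"
```

Note that the two commands do not answer the same question (substring versus whole word), so this only compares their cost on one input. It does not show that one can replace the other.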

From my own experience: I analyze really big text files (several hundred MB) and use regexes to find the rows I am interested in, and I don't experience performance problems because of regex.

Here is an interesting read about code optimization

stema answered Nov 15 '22 09:11
