I've noticed a lot of little debates about when to use regex and when to use a built-in string function like String.Replace() (.NET).
It seems a lot of people recommend always, always, always using regex whenever you deal with strings at all (besides just displaying them). Is this really best practice or just a wrong impression on my part? It seems like overkill to use regex when the problem is just "Remove any occurrence of any of these words from this text".
I'd like input so I can improve my own code and better answer other people's questions about string manipulation (there are a lot of them).
I think it's a mistake to treat Regex as a catch-all solution when a plain string-based search/replace is possible.
Regex is intrinsically a process of pattern matching and should be used when the strings you want to match are variable or only conform to a particular pattern. For cases where a simple string search would suffice, I would always recommend using the built-in methods of the String class.
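To make the distinction concrete, here is a minimal sketch in Python (not .NET, but the same idea applies). The sample text, word list, and function names are made up for illustration. Removing fixed substrings needs only str.replace(); removing whole words only is a pattern problem, so that is where a regex earns its keep.

```python
import re

# Hypothetical sample data, chosen so that "cat" also appears inside "category".
text = "The cat sat in the category aisle."
words = ["cat", "sat"]

def remove_literal(text, words):
    # Plain substring removal: no pattern engine involved.
    for word in words:
        text = text.replace(word, "")
    return text

def remove_whole_words(text, words):
    # Whole-word removal is a pattern: \b marks a word boundary.
    # re.escape keeps any special characters in the words literal.
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return re.sub(pattern, "", text)

literal_result = remove_literal(text, words)
whole_word_result = remove_whole_words(text, words)

print(literal_result)     # "category" is damaged because "cat" is a substring of it
print(whole_word_result)  # "category" is left alone
```

If the requirement really is "remove this exact substring wherever it occurs", the first function is all you need.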
I have never seen any benchmarks suggesting that a Regex-based lookup is faster than plain string indexing. Additionally, Regex engines vary in their features and performance, so what holds for one engine may not hold for another.
As if that were not enough, it is quite easy to construct a Regex that performs quite badly (uses a lot of backtracking, for instance) so deep knowledge of Regex is required if you really want to optimize performance using Regex matching. On the other hand, it is quite simple even for a n00b to perform string based searches or replacements.
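As a rough illustration of the backtracking problem, here is a small Python sketch. The patterns and the input are my own example, not taken from any benchmark. Both patterns accept the same strings, but the nested quantifier in the first one lets a backtracking engine try a huge number of ways to split the run of "a"s before giving up.

```python
import re
import time

# Almost matches, then fails on the final character.
# Kept short on purpose: each extra "a" roughly doubles the slow pattern's work.
subject = "a" * 18 + "b"

# Nested quantifiers: many ways to partition the same run of "a"s.
slow_pattern = re.compile(r"(a+)+$")
# Matches the same strings, without the nesting.
fast_pattern = re.compile(r"a+$")

def timed_search(pattern, s):
    # Returns the match result and the elapsed wall-clock time in seconds.
    start = time.perf_counter()
    result = pattern.search(s)
    return result, time.perf_counter() - start

slow_result, slow_seconds = timed_search(slow_pattern, subject)
fast_result, fast_seconds = timed_search(fast_pattern, subject)

print(f"nested: {slow_seconds:.6f}s  flat: {fast_seconds:.6f}s")
```

Neither pattern finds a match here; the only difference is how long the engine takes to find that out. Exact timings depend on the machine and the engine version.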
Regex.Replace() is much more expensive than the String.Replace() method. Use String.Replace() when possible, and use Regex when it's a necessity.
Take a look at this benchmark to see the time differences.
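If you want to measure this yourself, something like the following sketch works. It is in Python rather than .NET, so it compares str.replace() with re.sub() instead of String.Replace() with Regex.Replace(); the input text and repeat count are arbitrary assumptions, and the numbers will differ by machine and runtime.

```python
import re
import timeit

# Arbitrary test input: a short phrase repeated many times.
text = "foo bar baz " * 1000

# Precompile so the regex timing excludes compilation cost.
pattern = re.compile(re.escape("foo"))

def with_str_replace():
    return text.replace("foo", "qux")

def with_re_sub():
    return pattern.sub("qux", text)

# Time 200 runs of each approach.
str_seconds = timeit.timeit(with_str_replace, number=200)
re_seconds = timeit.timeit(with_re_sub, number=200)

print(f"str.replace: {str_seconds:.4f}s  re.sub: {re_seconds:.4f}s")
```

Both functions produce the same output, so any difference in the printed times is purely the cost of the mechanism.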
I just love regexes, but if there is a simple xxx->replace("foo","bar") type function available, it seems silly to use a power tool like regex when a simple screwdriver would do.
If performance is an issue, regex can be very CPU-intensive for simple substitutions. (Regex usually works out more efficient for a complex search/transform than a series of "simpler" calls.)
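Here is a sketch of the kind of complex transform where a single regex pass does better than a chain of simple calls. It is in Python, and the replacement table and function names are hypothetical. It does not measure speed; it shows that one pass scans the text once and, as a side benefit, avoids one replacement clobbering the result of another.

```python
import re

# Hypothetical replacement table: swap the two words.
replacements = {"cat": "dog", "dog": "cat"}

def chained(text):
    # One str.replace call per entry; later calls also see earlier output.
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text

# One alternation covering every key, escaped so the keys stay literal.
pattern = re.compile("|".join(re.escape(key) for key in replacements))

def single_pass(text):
    # Each match is looked up in the table and replaced exactly once.
    return pattern.sub(lambda m: replacements[m.group(0)], text)

sample = "cat chases dog"
print(chained(sample))      # cat chases cat
print(single_pass(sample))  # dog chases cat
```

For a single fixed substitution this machinery is overkill, which is the point made above.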
Also, I get continually caught out by the "minor" implementation differences -- like Python's match() builtin being implicitly anchored at the start of the string (an implied "^"). I was on the road with no internet access at the time and ended up buying another copy of Lutz's book to find out what was going on!
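For anyone hitting the same thing, here is a short sketch of how the entry points in Python's re module anchor. The pattern and inputs are just examples. match() is anchored only at the start, search() scans the whole string, and fullmatch() (Python 3.4+) anchors at both ends.

```python
import re

pattern = r"\d+"

# match() only tries at the start of the string.
match_at_start = re.match(pattern, "123abc")
match_not_at_start = re.match(pattern, "abc123")

# search() scans forward until it finds a match anywhere.
search_anywhere = re.search(pattern, "abc123")

# fullmatch() requires the whole string to match.
full_partial = re.fullmatch(pattern, "123abc")
full_exact = re.fullmatch(pattern, "123")

print(match_at_start.group(0))   # 123
print(match_not_at_start)        # None
print(search_anywhere.group(0))  # 123
print(full_partial)              # None
print(full_exact.group(0))       # 123
```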