
regex VS Contains. Best Performance? [closed]

Tags: java, regex

I want to check a URI string against several patterns in Java, and I want the fastest code possible.

Should I use:

if (uri.contains("/br/fab") || uri.contains("/br/err") || uri.contains("/br/sts"))

Or something like:

if (uri.matches(".*/br/(fab|err|sts).*"))

Note that I may have many more URIs to check, and this method is called very often.

Which of these two options is the better choice?

Mike asked Jan 07 '10 21:01



1 Answer

If you're going to use a regular expression, create it up-front and reuse the same Pattern object:

private static final Pattern pattern = Pattern.compile(".*/br/(fab|err|sts).*"); 

Do you actually need the ".*" at each end? I wouldn't expect it to be required, if you use Matcher.find().
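To illustrate the point about Matcher.find(), here is a minimal sketch. The class name UriPatternCheck, the method matchesBr and the sample URIs are made up for this example. Because find() searches anywhere in the input, the pattern no longer needs ".*" on either side, whereas String.matches() requires the whole string to match.

```java
import java.util.regex.Pattern;

class UriPatternCheck {
    // Compiled once and reused for every call.
    // No leading/trailing ".*": find() already looks for the pattern anywhere in the input.
    private static final Pattern BR_PATTERN = Pattern.compile("/br/(fab|err|sts)");

    // Hypothetical helper: true if the URI contains /br/fab, /br/err or /br/sts.
    static boolean matchesBr(String uri) {
        return BR_PATTERN.matcher(uri).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesBr("/app/br/fab/list")); // true
        System.out.println(matchesBr("/app/br/other"));    // false
    }
}
```

Pattern instances are immutable and safe to share between threads; each call creates its own Matcher, so the static field can be used from a frequently called method.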

Which is faster? The easiest way to find out is to measure it against some sample data - with as realistic samples as possible. (The fastest solution may very well depend on

Are you already sure this is a bottleneck though? If you've already measured the code enough to find out that it's a bottleneck, I'm surprised you haven't just tried both already. If you haven't verified that it's a problem, that's the first thing to do before worrying about the "fastest code possible".
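For a first rough measurement, something like the following could be used. It is only a sketch: the class name, the helper methods and the sample URIs are invented for illustration, and a naive System.nanoTime() loop like this is sensitive to JIT warm-up and other noise, so a dedicated harness such as JMH would give more trustworthy numbers.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

class ContainsVsRegexTiming {
    private static final Pattern BR_PATTERN = Pattern.compile("/br/(fab|err|sts)");

    static boolean viaContains(String uri) {
        return uri.contains("/br/fab") || uri.contains("/br/err") || uri.contains("/br/sts");
    }

    static boolean viaRegex(String uri) {
        return BR_PATTERN.matcher(uri).find();
    }

    // Runs `check` over all samples `rounds` times.
    // Returns {elapsed nanoseconds, number of matches}; returning the match
    // count makes it less likely that the JIT discards the work as dead code.
    static long[] time(Predicate<String> check, String[] samples, int rounds) {
        long hits = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (String uri : samples) {
                if (check.test(uri)) {
                    hits++;
                }
            }
        }
        return new long[] {System.nanoTime() - start, hits};
    }

    public static void main(String[] args) {
        // Hypothetical sample URIs; replace them with realistic data from your application.
        String[] samples = {"/app/br/fab/1", "/app/br/err", "/app/home", "/static/img/logo.png"};
        int rounds = 1_000_000;

        // Warm-up passes so both variants are JIT-compiled before the timed runs.
        time(ContainsVsRegexTiming::viaContains, samples, rounds);
        time(ContainsVsRegexTiming::viaRegex, samples, rounds);

        long[] c = time(ContainsVsRegexTiming::viaContains, samples, rounds);
        long[] r = time(ContainsVsRegexTiming::viaRegex, samples, rounds);
        System.out.printf("contains: %d ms (%d hits)%n", c[0] / 1_000_000, c[1]);
        System.out.printf("regex:    %d ms (%d hits)%n", r[0] / 1_000_000, r[1]);
    }
}
```

The hit counts of the two variants should be identical; if they differ, the two checks are not equivalent and the timings are not comparable.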

If it's not a bottleneck, I would personally opt for the non-regex version unless you're a regex junkie. Regular expressions are very powerful, but also very easy to get wrong.
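If the list of fragments grows, the non-regex version does not have to become a long chain of || conditions. One possible shape, assuming the fragments are known up front (the class name UriFragments and the method containsAny are hypothetical), is a simple loop over an array:

```java
class UriFragments {
    // Fragments taken from the question; extend this array as more are needed.
    private static final String[] FRAGMENTS = {"/br/fab", "/br/err", "/br/sts"};

    // Returns true as soon as any fragment is found in the URI.
    static boolean containsAny(String uri) {
        for (String fragment : FRAGMENTS) {
            if (uri.contains(fragment)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsAny("/app/br/err/42")); // true
        System.out.println(containsAny("/app/home"));      // false
    }
}
```

Each fragment triggers its own scan of the string, so with a very large number of fragments the trade-off against a single regex may change; that is again something to measure rather than assume.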

Jon Skeet answered Sep 19 '22 16:09