
Java regular expressions: performance and alternatives

Tags:

java

regex

Recently I have had to search a number of string values to see which ones match a certain pattern. Neither the number of string values nor the pattern itself is known until the user enters a search term. The problem is that I have noticed that each time my application runs the following line:

    if (stringValue.matches(rexExPattern)) {
        // do something so simple
    }

it takes about 40 microseconds. Needless to say, once the number of string values exceeds a few thousand, it becomes too slow.

The pattern is something like:

    "A*B*C*D*E*F*" 

where A to F are just examples, but the pattern has the shape shown above. Please note that the pattern actually changes per search. For example, "A*B*C*" may change to "W*D*G*A*".
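To make the shape of such a pattern concrete, here is a minimal sketch. It assumes the pattern is read literally as a Java regular expression, so that `A*` means "zero or more `A` characters" and the letters must appear in the given order. The class and method names (`StarPatternDemo`, `matchesAbc`) are made up for illustration.

```java
import java.util.regex.Pattern;

class StarPatternDemo {
    // Assumption: "A*B*C*" is an ordinary Java regex, not a glob-style wildcard.
    static final Pattern ABC = Pattern.compile("A*B*C*");

    // matches() requires the whole input to match, just like String.matches().
    static boolean matchesAbc(String input) {
        return ABC.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesAbc("AABBBC")); // true: runs of A, B, C in order
        System.out.println(matchesAbc(""));       // true: every run may be empty
        System.out.println(matchesAbc("AC"));     // true: the B run is empty
        System.out.println(matchesAbc("ACB"));    // false: B appears after C
        System.out.println(matchesAbc("ABCX"));   // false: X is not covered by the pattern
    }
}
```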

I wonder if there is a better substitute for the above pattern or, more generally, an alternative to Java regular expressions.

Joseph_Marzbani asked Nov 07 '13 07:11



1 Answer

Regular expressions in Java are compiled into an internal data structure, and this compilation is the time-consuming step. Each time you invoke the method String.matches(String regex), the specified regular expression is compiled again.

So you should compile your regular expression only once and reuse it:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Compile once, outside the loop
    Pattern pattern = Pattern.compile(regexPattern);
    for (String value : values) {
        Matcher matcher = pattern.matcher(value);
        if (matcher.matches()) {
            // your code here
        }
    }
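As a self-contained illustration of the same idea, the sketch below wraps the loop in a helper that compiles the pattern once per search and then checks every value against it. The class and method names (`CompiledPatternFilter`, `filterMatches`) are hypothetical. Reusing a single `Matcher` through `reset` is an optional extra beyond the snippet above, and since a `Matcher` is not thread-safe, this version assumes single-threaded use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CompiledPatternFilter {
    // Hypothetical helper: returns the values that match the regex in full.
    static List<String> filterMatches(String regex, List<String> values) {
        Pattern pattern = Pattern.compile(regex); // compiled once per search term
        Matcher matcher = pattern.matcher("");    // one Matcher, re-pointed at each value
        List<String> hits = new ArrayList<>();
        for (String value : values) {
            // reset(value) swaps the input; matches() tests the whole string
            if (matcher.reset(value).matches()) {
                hits.add(value);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("AABC", "WDG", "ACB");
        System.out.println(filterMatches("A*B*C*", values));   // [AABC]
        System.out.println(filterMatches("W*D*G*A*", values)); // [WDG]
    }
}
```

Because the pattern changes with every search, the compilation cost is still paid once per search term, but no longer once per string value.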
Seelenvirtuose answered Sep 23 '22 23:09