
Regex in Java and its performance compared to indexOf

Can someone tell me how to match an underscore "_" and a period "." exactly once each in a string using a regex? Also, is it more efficient to use indexOf() instead of a regular expression?

String s = "Hello_Wor.ld";
// or
s = "12323_!£££$.asdfasd";

Basically, any number of characters can come before and after the _ and the . characters. The only requirement is that the entire string contains exactly one occurrence of _ and exactly one occurrence of .

George asked Nov 16 '11 19:11


2 Answers

indexOf will be much quicker than a regex, and will probably also be easier to understand.

Just test that indexOf('_') >= 0, and then that indexOf('_', indexOfFirstUnderScore + 1) < 0, so the second search starts after the first match. Do the same for the period.

private boolean containsOneAndOnlyOne(String s, char c) {
    int firstIndex = s.indexOf(c);
    if (firstIndex < 0) {
        return false;
    }
    int secondIndex = s.indexOf(c, firstIndex + 1);
    return secondIndex < 0;
}
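
To cover the question's requirement, the helper can be called once for each character and the results combined. The following is only a minimal sketch: the class name SingleOccurrenceCheck and the method hasOneUnderscoreAndOnePeriod are made up for illustration, and the helper is made static so it can be called without an instance.

```java
class SingleOccurrenceCheck {
    // Same logic as the helper above: true only if c occurs exactly once in s.
    static boolean containsOneAndOnlyOne(String s, char c) {
        int firstIndex = s.indexOf(c);
        if (firstIndex < 0) {
            return false; // c does not occur at all
        }
        // Search again, starting just past the first occurrence.
        return s.indexOf(c, firstIndex + 1) < 0;
    }

    // Hypothetical wrapper: exactly one '_' and exactly one '.' in the string.
    static boolean hasOneUnderscoreAndOnePeriod(String s) {
        return containsOneAndOnlyOne(s, '_') && containsOneAndOnlyOne(s, '.');
    }

    public static void main(String[] args) {
        System.out.println(hasOneUnderscoreAndOnePeriod("Hello_Wor.ld")); // true
        System.out.println(hasOneUnderscoreAndOnePeriod("a_b_c.d"));      // false (two underscores)
        System.out.println(hasOneUnderscoreAndOnePeriod("no_period"));    // false (no period)
    }
}
```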
JB Nizet answered Sep 27 '22 22:09


Matches a string with a single .:

/^[^.]*\.[^.]*$/

Same for _:

/^[^_]*_[^_]*$/

The combined regex should be something like:

/^(?:[^._]*\.[^._]*_[^._]*|[^._]*_[^._]*\.[^._]*)$/
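
In Java, a combined pattern of this kind might be written as in the sketch below. It assumes Matcher.matches() is used, which only succeeds when the whole input matches, so both alternatives are anchored at both ends without explicit ^ and $. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class CombinedRegexCheck {
    // Either "one . then one _" or "one _ then one .", with no other . or _ anywhere.
    // Backslashes are doubled because this is a Java string literal.
    private static final Pattern ONE_OF_EACH = Pattern.compile(
            "[^._]*\\.[^._]*_[^._]*" + "|" + "[^._]*_[^._]*\\.[^._]*");

    static boolean hasOneUnderscoreAndOnePeriod(String s) {
        // matches() requires the entire string to match the pattern.
        return ONE_OF_EACH.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasOneUnderscoreAndOnePeriod("Hello_Wor.ld")); // true
        System.out.println(hasOneUnderscoreAndOnePeriod("a.b_c"));        // true
        System.out.println(hasOneUnderscoreAndOnePeriod("a..b_c"));       // false
    }
}
```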

It should by now be obvious that indexOf is the better solution, being simpler (performance is irrelevant until it has been shown to be the bottleneck).

If interested, note how the combined regex has two terms, for "string with a single . before a single _" and vice versa. It would have six for three characters, and n! for n. It would be simpler to run both regexes and AND the result than to use the combined regex.
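
Running the two single-character patterns and ANDing the results could look like the following in Java. This is a sketch under assumptions: the class name, constant names, and method name are invented for illustration, and the patterns are compiled once and reused rather than recompiled on every call.

```java
import java.util.regex.Pattern;

class TwoRegexCheck {
    // matches() must consume the whole input, so ^ and $ are implied here.
    private static final Pattern ONE_PERIOD = Pattern.compile("[^.]*\\.[^.]*");
    private static final Pattern ONE_UNDERSCORE = Pattern.compile("[^_]*_[^_]*");

    static boolean hasOneUnderscoreAndOnePeriod(String s) {
        // Each pattern checks one character on its own; the order of the two
        // characters in the string does not matter.
        return ONE_PERIOD.matcher(s).matches()
                && ONE_UNDERSCORE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasOneUnderscoreAndOnePeriod("Hello_Wor.ld")); // true
        System.out.println(hasOneUnderscoreAndOnePeriod("a.b_c"));        // true
        System.out.println(hasOneUnderscoreAndOnePeriod("a_b_c.d"));      // false
    }
}
```

Adding a third required character would then mean one more pattern and one more && term, rather than six alternatives in a single expression.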

One must always look for a simpler solution while using regexes.

aib answered Sep 28 '22 00:09