 

Java RegEx negative lookbehind


I have the following Java code:

Pattern pat = Pattern.compile("(?<!function )\\w+");
Matcher mat = pat.matcher("function example");
System.out.println(mat.find());

Why does mat.find() return true? I used negative lookbehind and example is preceded by function. Shouldn't it be discarded?

Sorin asked Aug 02 '13 11:08



1 Answer

See what it matches:

public static void main(String[] args) throws Exception {
    Pattern pat = Pattern.compile("(?<!function )\\w+");
    Matcher mat = pat.matcher("function example");
    while (mat.find()) {
        System.out.println(mat.group());
    }
}

Output:

function
xample

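So first it finds function, which isn't preceded by "function ". Then it finds xample: the attempt starting at the e of example is rejected by the lookbehind, so the engine retries one character later, where the preceding text is "unction e" rather than "function ", and the lookbehind no longer blocks the match.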
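To make the retry behaviour visible, here is a minimal sketch that records where each match starts and ends. The class name LookbehindTrace and the helper trace are made up for illustration. The second pattern adds a word boundary (\b), which is an assumption about the intent: it only applies if the goal is to find whole words that are not preceded by "function ".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindTrace {
    // Hypothetical helper: collects "start-end:text" for every match find() reports.
    static List<String> trace(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "function example";
        // Original pattern: a match may start in the middle of a word.
        System.out.println(trace("(?<!function )\\w+", input));
        // With \b a match may only start at a word boundary,
        // so the engine cannot skip the 'e' and match "xample".
        System.out.println(trace("(?<!function )\\b\\w+", input));
    }
}
```

The first line printed is [0-8:function, 10-16:xample] and the second is [0-8:function]. Note that find() still returns true with the second pattern, because the word function itself is a valid match.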

Presumably you want the pattern to match the whole text, not just find matches in the text.

You can either do this with Matcher.matches() or you can change the pattern to add start and end anchors:

^(?<!function )\\w+$

I prefer the second approach, as it means that the pattern itself defines its match region rather than the region being defined by its usage. That's just a matter of preference, however.
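The two options can be compared side by side in a small sketch. The class name WholeInputMatch and the method names viaMatches and viaAnchors are invented for this example and are not part of any API.

```java
import java.util.regex.Pattern;

public class WholeInputMatch {
    // Option 1: matches() requires the pattern to cover the entire input.
    static boolean viaMatches(String input) {
        return Pattern.compile("(?<!function )\\w+").matcher(input).matches();
    }

    // Option 2: find() with explicit ^ and $ anchors inside the pattern.
    static boolean viaAnchors(String input) {
        return Pattern.compile("^(?<!function )\\w+$").matcher(input).find();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"function example", "example"}) {
            System.out.println(input + " -> " + viaMatches(input) + ", " + viaAnchors(input));
        }
    }
}
```

Both variants reject "function example" and accept "example". One caveat worth keeping in mind: when the match has to start at the beginning of the input, nothing precedes that position, so the lookbehind cannot fail there. The rejection of "function example" in this sketch comes from \w+ being unable to cross the space, not from the lookbehind.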

Boris the Spider answered Nov 03 '22 21:11