
Java find value in a string using regex

Tags:

java

regex

java-8

I'm wondering about the behavior of the Matcher in Java.

I have a compiled pattern, and when I iterate over the matcher's results I don't understand why a specific value is missing.

My code:

String str = "star wars";
Pattern p = Pattern.compile("star war|Star War|Starwars|star wars|star wars|pirates of the caribbean|long strage trip|drone|snatched (2017)");
Matcher matcher = p.matcher(str);
while (matcher.find()) {
    System.out.println("\nRegex : " + matcher.group());
}

I get a hit for "star war", which is right, as it is in my pattern.

But I don't get "star wars" as a hit, and I don't understand why, since it is also part of my pattern.

omri_saadon asked May 17 '26 18:05


1 Answer

This behavior is expected because alternation in an NFA regex engine is "eager": at each position the alternatives are tried from left to right, the first one that matches wins, and the remaining alternatives are not tried. Also note that once the engine finds a match for a consuming pattern (yours is one; it is not a zero-width assertion such as a lookahead, lookbehind, word boundary, or anchor), the index advances to the end of the match, and the search for the next match starts from that position.

So once the first alternative, star war, matches, there is no way to match star wars: the regex index has already advanced past the first eight characters and sits just before the final s.

Instead, just check whether the string contains each of the strings you are looking for. The simplest approach is a loop:
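To see the effect of alternative order in isolation, here is a minimal sketch. The class name AlternationOrderDemo and the helper findAll are made up for illustration; only Pattern and Matcher come from java.util.regex. It records each match together with its start and end index, first with the shorter alternative listed first and then with the longer one first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
public class AlternationOrderDemo {

    // Collects every match reported by Matcher.find(), formatted as "text@start-end".
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Shorter alternative first: it wins at index 0, the search resumes
        // at index 8, and the lone trailing "s" matches nothing.
        System.out.println(findAll("star war|star wars", "star wars"));

        // Longer alternative first: the whole input is matched instead.
        System.out.println(findAll("star wars|star war", "star wars"));
    }
}
```

Either way, a single pass of find() reports only one of the two overlapping candidates, which is why a per-string containment check is the simpler fit for the question.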

String str = "star wars";
String[] arr = {"star war","Star War","Starwars","star wars","pirates of the caribbean","long strage trip","drone","snatched (2017)"};
for(String s: arr){
    if(str.contains(s))
        System.out.println(s);
}

See the Java demo

By the way, your regex contains snatched (2017). The unescaped ( and ) form a capturing group instead of matching literal parentheses, so that alternative only matches snatched 2017. To match literal parentheses, the ( and ) must be escaped. I also removed a duplicate entry for star wars.
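If you do want to keep such a title inside a regex, the parentheses can be escaped by hand or the whole string can be passed through Pattern.quote. A minimal sketch follows; the class name LiteralParensDemo and the helper found are hypothetical, used only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
public class LiteralParensDemo {

    // Returns true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String title = "snatched (2017)";

        // Unescaped: ( and ) form a group, so the pattern means "snatched 2017".
        System.out.println(found("snatched (2017)", title));           // false
        System.out.println(found("snatched (2017)", "snatched 2017")); // true

        // Escaped by hand: \( and \) match literal parentheses.
        System.out.println(found("snatched \\(2017\\)", title));       // true

        // Pattern.quote treats the whole string as a literal.
        System.out.println(found(Pattern.quote(title), title));        // true
    }
}
```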

Wiktor Stribiżew answered May 19 '26 07:05