Java Regex pattern matching first occurrence of “boundary” after any character sequence

Tags: java, regex

I want a pattern whose capture group ends at the first occurrence of the “boundary”. At the moment the match runs to the last occurrence instead.

E.g.:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "this should match from A to the first B and not 2nd B, got that?";
Pattern ptrn = Pattern.compile("\\b(A.*B)\\b");
Matcher mtchr = ptrn.matcher(text);
// Print every match found in the text
while (mtchr.find()) {
    String match = mtchr.group();
    System.out.println("Match = <" + match + ">");
}

prints:

"Match = <A to the first B and not 2nd B>"

and I want it to print:

"Match = <A to the first B>"

What do I need to change within the pattern?

amphibient asked Oct 11 '12 21:10



2 Answers

Make your * non-greedy / reluctant using *?:

Pattern ptrn = Pattern.compile("\\b(A.*?B)\\b");

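By default, quantifiers are greedy: .* matches as many characters as possible while still allowing the overall pattern to match, so the match extends to the last B. The reluctant form .*? matches as few characters as possible, so the match stops at the first B.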

See Reluctant Quantifiers from the docs, and this tutorial.
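
To make the difference concrete, here is a minimal sketch that runs both patterns against the text from the question. The class name GreedyVsReluctant and the helper firstMatch are illustrative names, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Hypothetical helper: returns the first match of regex in text, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "this should match from A to the first B and not 2nd B, got that?";

        // Greedy .* consumes as much as it can, then backtracks to the last B
        System.out.println("Greedy    = <" + firstMatch("\\b(A.*B)\\b", text) + ">");

        // Reluctant .*? grows one character at a time and stops at the first B
        System.out.println("Reluctant = <" + firstMatch("\\b(A.*?B)\\b", text) + ">");
    }
}
```

The greedy line should print the long match shown in the question, and the reluctant line the short one that was asked for.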

pb2q answered Oct 07 '22 16:10


Don't use a greedy expression for matching, i.e.:

Pattern ptrn = Pattern.compile("\\b(A.*?B)\\b");

Reimeus answered Oct 07 '22 16:10