Group Numbering With Optional Blocks in a Regular Expression

Tags: java, regex

Is there any way to have an expression in brackets not be caught in a group?

E.g. I have an expression like this:

(A(B|C)?) D (E(F|G)?)

Note that the optional blocks (B|C)? and (F|G)? need brackets.
I'm not interested in what was caught in these groups. All I want is to catch the full first and last block.

But because of the optional blocks, the group numbering will change and I can't tell whether (E(F|G)?) was caught as group 2 or 3.

Can I tell the expression to ignore the optional parts in the result groups, so that the group numbering stays the same? Or can I make optional groups always appear in the result, even when they're null?

Stroboskop asked Feb 10 '10 12:02


2 Answers

(E(F|G)?) will always be caught as group 3. The numbering is determined by the order of opening parentheses in the pattern string, which is:

(A(B|C)?) D (E(F|G)?)
^ ^         ^ ^
1 2         3 4

If (B|C) does not occur in the input string then group(2) will return null, but the subsequent groups will not be renumbered.

The only groups that do not influence numbering are non-capturing groups, e.g.

(A(?:B|C)?) D (E(?:F|G)?)
^             ^
1             2

Example:

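import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("(A(B|C)?) D (E(F|G)?)");
Matcher matcher = pattern.matcher("A D EG");
if (matcher.matches()) {
    System.err.println(matcher.group(1)); // first block
    System.err.println(matcher.group(2)); // optional (B|C) did not take part in the match: null
    System.err.println(matcher.group(3)); // last block, still group 3
    System.err.println(matcher.group(4)); // optional (F|G)
}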

Output:

A
null
EG
G
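
For comparison, here is a minimal sketch of the non-capturing variant run against a few inputs, to show that the two outer blocks stay at group 1 and group 2 whether or not the optional parts are present. The class name NonCapturingDemo and the helper blocks() are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCapturingDemo {
    // The optional parts use (?:...), so only the two outer blocks are numbered.
    private static final Pattern PATTERN =
            Pattern.compile("(A(?:B|C)?) D (E(?:F|G)?)");

    // Hypothetical helper: returns {first block, last block},
    // or null if the whole input does not match.
    static String[] blocks(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        for (String input : new String[] { "A D E", "AB D E", "A D EG", "AC D EF" }) {
            String[] b = blocks(input);
            System.out.println(input + " -> " + b[0] + ", " + b[1]);
        }
    }
}
```

With these inputs the first block is always group(1) and the last block is always group(2); there is no group 3 or 4 to skip over.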
finnw answered Nov 08 '22 17:11


There are non-capturing groups (?:…):

(A(?:B|C)?) D (E(?:F|G)?)

The match of such a group cannot be referenced, and the group is not counted in the numbering.
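
One way to check this is to compare Matcher.groupCount() for the two patterns and to ask for a group index that only exists in the capturing version. The sketch below assumes nothing beyond java.util.regex; the class name GroupCountDemo and its two helpers are hypothetical names for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountDemo {
    // Number of capturing groups declared in the pattern (group 0 is not counted).
    static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // True if the pattern has no capturing group with the given index.
    static boolean isOutOfRange(String regex, String input, int index) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("input does not match: " + input);
        }
        try {
            m.group(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String capturing = "(A(B|C)?) D (E(F|G)?)";
        String nonCapturing = "(A(?:B|C)?) D (E(?:F|G)?)";
        System.out.println(countGroups(capturing));                  // 4
        System.out.println(countGroups(nonCapturing));               // 2
        System.out.println(isOutOfRange(nonCapturing, "A D EG", 3)); // true
    }
}
```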

Gumbo answered Nov 08 '22 17:11