
Is there a way to capture each group if multiple occurrences are matched?

Tags: java, regex

I don't know how to explain the problem in plain English, so I'll use a regex example. I have something similar to this (the example is heavily simplified):

((\\d+) - (\\d+)\n)+

This pattern matches these lines at once:

123 - 23
32 - 321
3 - 0
99 - 55

The pattern contains 3 groups: the first one matches a line, the 2nd one matches the first number in the line, and the 3rd one matches the second number in the line.

Is it possible to get all of those numbers? The Matcher has only 3 groups: the first one returns 99 - 55, the 2nd one returns 99, and the 3rd one returns 55.

SSCCE:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test {
    private static final Pattern pattern = Pattern.compile("((\\d+) - (\\d+)\n)+");

    public static void parseInput(String input) {

        Matcher matcher = pattern.matcher(input);

        // matches() succeeds once for the whole input
        if (matcher.matches()) {

            // groups 1-3 hold only the captures from the last repetition
            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println("------------");
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
            System.out.println();
        }

    }

    public static void main(String[] args) {
        parseInput("123 - 23\n32 - 321\n3 - 0\n99 - 55\n");
    }
}
Roman asked Nov 26 '10 11:11


1 Answer

One more remark about Mike Caron's answer: the program will not work if you simply replace "if" with "while" and use "find" instead of "matches". You should also change the regular expression: the outer group with the "+" should be removed, because you want to search for multiple occurrences of the line pattern, not for a single occurrence of a (..)+ group.

For clarity, this is the final program that works:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test {
    // One line per match: no outer (...)+ group
    private static final Pattern pattern = Pattern.compile("(\\d+) - (\\d+)\n");

    public static void parseInput(String input) {

        Matcher matcher = pattern.matcher(input);

        // find() advances to the next line on every call
        while (matcher.find()) {

            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println("------------");
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
            System.out.println();
        }
    }

    public static void main(String[] args) {
        parseInput("123 - 23\n32 - 321\n3 - 0\n99 - 55\n");
    }
}

For each line it prints three groups: group 0 is the entire match (the whole line, including the trailing newline), and groups 1 and 2 each contain one of the numbers. This is a good tutorial that helped me understand it better: http://tutorials.jenkov.com/java-regex/matcher.html
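
If you want the numbers as data rather than printed output, and still want the all-or-nothing check that matches() gave in the question, something along these lines might work. It is only a sketch: the class name PairExtractor, the method extractPairs, and the choice of returning int[] pairs are my own assumptions, and it assumes every number fits in an int. The whole input is validated with the repeated pattern first, then the per-line pattern is walked with find().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original program.
class PairExtractor {
    // The repeated pattern from the question, used only to validate the whole input.
    private static final Pattern WHOLE_INPUT = Pattern.compile("(\\d+ - \\d+\n)+");
    // The per-line pattern from the answer, used to pull out the numbers.
    private static final Pattern LINE = Pattern.compile("(\\d+) - (\\d+)\n");

    static List<int[]> extractPairs(String input) {
        List<int[]> pairs = new ArrayList<>();
        // Reject input that is not made up entirely of "number - number" lines.
        if (!WHOLE_INPUT.matcher(input).matches()) {
            return pairs; // empty list for malformed input
        }
        Matcher matcher = LINE.matcher(input);
        // Each find() call moves to the next line; groups 1 and 2 are that line's numbers.
        while (matcher.find()) {
            pairs.add(new int[] {
                Integer.parseInt(matcher.group(1)), // assumes the value fits in an int
                Integer.parseInt(matcher.group(2))
            });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (int[] pair : extractPairs("123 - 23\n32 - 321\n3 - 0\n99 - 55\n")) {
            System.out.println(pair[0] + " and " + pair[1]);
        }
    }
}
```

On Java 9 and later, Matcher.results() returns a stream of MatchResult objects and could replace the while loop.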

marczoid answered Sep 27 '22 01:09