
Java pattern matcher group definition

Tags: java, regex

I have a simple regular expression which looks something like

([a-z]*)( +[a-z]="[0-9]")*

and it works in matching patterns like

test a="1" b="2" c="3"...

Is there any way of capturing each of the name-value pairs (e.g., a="1") in a separate matcher group?

As it is in the above example, I get a matcher group for (test) and only one matcher group for the 3 name-value pairs (i.e., the last one, c="3"). I would expect 3 matcher groups, 1 for each such pair.

PNS asked Jun 12 '11 20:06


1 Answer

I would expect 3 matcher groups, 1 for each such pair.

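No, the pattern has only two groups in total: the number of groups is fixed by the number of parenthesis pairs in the pattern, and a repeated group keeps only its last match. The only way to get the key-value pairs in three separate groups is to write each group out: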

([a-z]*)( +[a-z]="[0-9]")( +[a-z]="[0-9]")( +[a-z]="[0-9]")
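
To see why the repeated group reports only c="3", here is a minimal sketch that runs the pattern from the question and prints what each group holds. The class name RepeatedGroupDemo and the helper lastIteration are made up for illustration; only the java.util.regex calls come from the standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // The pattern from the question: group 2 is repeated by the trailing '*'.
    static final Pattern REPEATED = Pattern.compile("([a-z]*)( +[a-z]=\"[0-9]\")*");

    // Hypothetical helper: returns what group 2 holds after a full match,
    // or null if the input does not match (or group 2 never took part).
    static String lastIteration(String input) {
        Matcher m = REPEATED.matcher(input);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "test a=\"1\" b=\"2\" c=\"3\"";
        Matcher m = REPEATED.matcher(input);
        if (m.matches()) {
            System.out.println(m.groupCount()); // 2: one group per pair of parentheses
            System.out.println(m.group(1));     // test
            System.out.println(m.group(2));     //  c="3" (leading space included, last repetition only)
        }
    }
}
```

Each repetition overwrites the previous capture, so the text of a="1" and b="2" is matched but not retained in any group.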

Alternatively, you could match all key-value pairs in a single group and then run a separate Pattern and Matcher on that group:

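import java.util.regex.*;

public class Main {
  public static void main(String[] args) {

    String text = "test a=\"1\" b=\"2\" c=\"3\" bar d=\"4\" e=\"5\"";
    System.out.println(text + "\n");

    // Group 1 is the name; group 2 captures the whole run of key-value pairs.
    // The inner (?:...) is non-capturing, so repeating it does not lose text.
    Matcher m1 = Pattern.compile("([a-z]*)((?:[ \t]+[a-z]=\"[0-9]\")*)").matcher(text);

    // Compile the pair pattern once instead of on every iteration.
    Pattern pair = Pattern.compile("([a-z])=\"([0-9])\"");

    // Both parts of the outer pattern are optional, so find() also yields an
    // empty match between entries and at the end, which prints a blank line.
    while (m1.find()) {

      System.out.println(m1.group(1));

      // Second pass: pull each key and value out of the captured run.
      Matcher m2 = pair.matcher(m1.group(2));

      while (m2.find()) {
        System.out.println("  " + m2.group(1) + " -> " + m2.group(2));
      }
    }
  }
}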

which produces:

test a="1" b="2" c="3" bar d="4" e="5"

test
  a -> 1
  b -> 2
  c -> 3

bar
  d -> 4
  e -> 5
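
If the pairs are needed as data rather than printed output, the same two-pass idea can fill a map. This is only a sketch under a few assumptions: the class name PairCollector and the method parse are hypothetical, a name is assumed to need at least one letter ([a-z]+ instead of [a-z]*) so the outer pattern never produces an empty match, and a name that appears twice would overwrite its earlier entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairCollector {
    // Assumption: names have at least one letter, which rules out empty matches.
    private static final Pattern ENTRY =
        Pattern.compile("([a-z]+)((?:[ \t]+[a-z]=\"[0-9]\")*)");
    private static final Pattern PAIR = Pattern.compile("([a-z])=\"([0-9])\"");

    // Maps each name to its key-value pairs, keeping the order of appearance.
    static Map<String, Map<String, String>> parse(String text) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        Matcher entry = ENTRY.matcher(text);
        while (entry.find()) {
            Map<String, String> pairs = new LinkedHashMap<>();
            // Second pass over the captured run of pairs for this name.
            Matcher pair = PAIR.matcher(entry.group(2));
            while (pair.find()) {
                pairs.put(pair.group(1), pair.group(2));
            }
            result.put(entry.group(1), pairs);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {test={a=1, b=2, c=3}, bar={d=4, e=5}}
        System.out.println(parse("test a=\"1\" b=\"2\" c=\"3\" bar d=\"4\" e=\"5\""));
    }
}
```

LinkedHashMap is used here so that iteration follows the order in which names and keys appear in the input.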
Bart Kiers answered Sep 23 '22 09:09