
The meaning of \\1* operator in Java regexes [duplicate]

Tags:

java

regex

I am learning about Java regexes, and I noticed the following operator:

\\1*

I'm having a hard time figuring out what it means (searching the web didn't help). For example, what is the difference between these two options:

    Pattern p1 = Pattern.compile("(a)\\1*"); // option1
    Pattern p2 = Pattern.compile("(a)"); // option2

    Matcher m1 = p1.matcher("a");
    Matcher m2 = p2.matcher("a");

    // group() throws IllegalStateException unless a match has been found first
    m1.find();
    m2.find();
    System.out.println(m1.group(0));
    System.out.println(m2.group(0));

Result:

a
a

Thanks!

Friedman asked Aug 01 '16 17:08


2 Answers

\\1 is a backreference to the first capturing group, which is (a) here. It matches the same text that the group captured.

So (a)\\1* is equivalent to (a)a* in this particular case.

Here is an example that shows the difference:

    Pattern p1 = Pattern.compile("(a)\\1*");
    Pattern p2 = Pattern.compile("(a)");

    Matcher m1 = p1.matcher("aa");
    Matcher m2 = p2.matcher("aa");

    m1.find();
    System.out.println(m1.group());
    m2.find();
    System.out.println(m2.group());

Output:

aa
a

As you can see, when the input contains several consecutive a's, the first regular expression matches the whole run, while the second one matches only the first a. (In both cases, group 1 itself holds just a single a.)
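The "in this particular case" caveat matters: a backreference repeats the text that the group captured, not the group's pattern. Here is a minimal sketch of that distinction, assuming a group with an alternation, (a|b), which is not part of the original question. The class name BackreferenceDemo and the helper fullMatch are made up for illustration.

```java
import java.util.regex.Pattern;

public class BackreferenceDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // \\1 must repeat exactly what group 1 captured ("a"), so the trailing "b" is rejected.
        System.out.println(fullMatch("(a|b)\\1*", "aab"));    // false
        // Repeating the group's pattern instead lets every repetition pick a or b freely.
        System.out.println(fullMatch("(a|b)(a|b)*", "aab"));  // true
        // With a uniform run, both forms agree.
        System.out.println(fullMatch("(a|b)\\1*", "aaa"));    // true
    }
}
```

So (a)\\1* and (a)a* behave the same only because the group can capture nothing but a.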

Nicolas Filotto answered Oct 17 '22 01:10


\\1* looks for a again, zero or more times. An example using (a)\\1+, which requires at least two consecutive a's, may be easier to understand:

    Pattern p1 = Pattern.compile("(a)\\1+");
    Matcher m1 = p1.matcher("aaaaabbaaabbba");
    while (m1.find()) {
        System.out.println(m1.group());
    }

The output will be:

aaaaa
aaa

But the last a won't match because it is not repeated.
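To see the effect of the quantifier directly, here is a small sketch that runs both (a)\\1+ and (a)\\1* over the same input. The class name QuantifierDemo and the helper findAll are hypothetical, introduced only to collect the matches into a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical helper: collects every match that find() reports, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "aaaaabbaaabbba";
        // + needs at least one repetition, so the lone trailing a is skipped.
        System.out.println(findAll("(a)\\1+", input)); // [aaaaa, aaa]
        // * allows zero repetitions, so the lone trailing a matches as well.
        System.out.println(findAll("(a)\\1*", input)); // [aaaaa, aaa, a]
    }
}
```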

assylias answered Oct 17 '22 00:10