The code
String s = "y z a a a b c c z";
Pattern p = Pattern.compile("(a )+(b )+(c *)c");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group());
}
prints
a a a b c c
which is right.
But logically, the substrings
a a a b c
a a b c c
a a b c
a b c c
a b c
match the regex too.
So, how can I make the code find those substrings too, i.e. not only the most extended one, but also its children?
You can use the reluctant quantifiers such as *? and +?. These match as little as possible, in contrast to the standard * and +, which are greedy, i.e. match as much as possible. Still, this only allows you to find particular "sub-matches", not all of them. Some more control can be achieved using lookahead constructs (a kind of non-capturing group), also described in the Pattern docs. But in order to really find all sub-matches, you would probably have to do the work yourself, i.e. build the automaton to which the regex corresponds and navigate it using custom code.
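Short of building the automaton, one brute-force way to enumerate every sub-match is to test each substring of the input against the whole pattern. The sketch below is only an illustration under a few assumptions: the class name AllSubMatches and the helper findAll are made up for this example, the pattern uses the grouped form (c )* rather than the (c *) from the question, and the quadratic number of candidate substrings is acceptable for short inputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllSubMatches {
    // Hypothetical helper: returns every non-empty substring of s that the
    // pattern matches in full, ordered by start index and then by end index.
    static List<String> findAll(Pattern p, String s) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(s);
        for (int start = 0; start < s.length(); start++) {
            for (int end = start + 1; end <= s.length(); end++) {
                // Restrict the matcher to [start, end) and require that the
                // whole region matches, not just a prefix of it.
                m.region(start, end);
                if (m.matches()) {
                    hits.add(m.group());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: "c " is grouped so that repeated "c " tokens are allowed.
        Pattern p = Pattern.compile("(a )+(b )+(c )*c");
        for (String hit : findAll(p, "y z a a a b c c z")) {
            System.out.println(hit);
        }
    }
}
```

For the input from the question this should list the longest match together with the five shorter ones. Each candidate substring triggers its own match attempt, so this is only practical for short strings.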
You will need a lazy quantifier.
Please try the following:
Pattern p = Pattern.compile("(a )+(b )+((c )*?)c");
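To see what the lazy quantifier changes, here is a small self-contained comparison against the same input as in the question. The class name LazyVsGreedy and the helper firstMatch are invented for this sketch, and the greedy variant is written with the grouped (c )* so that the two patterns differ only in the quantifier.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyVsGreedy {
    // Hypothetical helper: returns the first match of regex in s, or null.
    static String firstMatch(String regex, String s) {
        Matcher m = Pattern.compile(regex).matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "y z a a a b c c z";
        // Greedy: (c )* takes as many "c " tokens as it can.
        System.out.println(firstMatch("(a )+(b )+(c )*c", s));    // a a a b c c
        // Lazy: (c )*? takes as few "c " tokens as it can.
        System.out.println(firstMatch("(a )+(b )+((c )*?)c", s)); // a a a b c
    }
}
```

Note that (a )+ is still greedy in both patterns, so the match always starts at the first "a"; the lazy quantifier only shortens the tail.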
Please also note that I grouped "c " once again, since I think that's what you want. Otherwise (c *) would match a single "c" followed by arbitrarily many spaces, but not repeated occurrences of "c ".