
What is a regex "independent non-capturing group"?

Tags: java, regex

From the Java 6 Pattern documentation:

Special constructs (non-capturing)

(?:X)   X, as a non-capturing group

(?>X)   X, as an independent, non-capturing group

What is the difference between (?:X) and (?>X)? What does "independent" mean in this context?

asked Sep 08 '08 19:09 by Peter Hart



2 Answers

It means that the grouping is atomic, and it throws away backtracking information for a matched group. So, this expression is possessive; it won't back off even if doing so is the only way for the regex as a whole to succeed. It's "independent" in the sense that it doesn't cooperate, via backtracking, with other elements of the regex to ensure a match.
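
To make the contrast with (?:X) concrete, here is a minimal sketch using java.util.regex. The class name IndependentGroupDemo, the helper matchesWhole, and the input "aaab" are made up for illustration. It also includes the possessive quantifier a++, which should behave the same way as the atomic group in this case.

```java
import java.util.regex.Pattern;

class IndependentGroupDemo {
    // Hypothetical helper: true only if the whole input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "aaab";

        // (?:a+) first takes all three 'a's, then hands one back on
        // backtracking so that the trailing "ab" can match.
        System.out.println(matchesWhole("(?:a+)ab", input)); // true

        // (?>a+) also takes all three 'a's, but never gives any back,
        // so the literal 'a' after the group has nothing left to match.
        System.out.println(matchesWhole("(?>a+)ab", input)); // false

        // The possessive quantifier a++ refuses to backtrack in the same way.
        System.out.println(matchesWhole("a++ab", input));    // false
    }
}
```

Only the plain non-capturing group lets the overall match succeed, because it is the only one of the three that cooperates with the rest of the pattern by backtracking.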

answered Sep 20 '22 18:09 by erickson


I think this tutorial explains exactly what an "independent, non-capturing group", also known as "atomic grouping", is:

The regular expression a(bc|b)c (capturing group) matches abcc and abc. The regex a(?>bc|b)c (atomic group) matches abcc but not abc.

When applied to abc, both regexes will match a to a, bc to bc, and then c will fail to match at the end of the string. Here their paths diverge. The regex with the capturing group has remembered a backtracking position for the alternation. The group will give up its match, b then matches b and c matches c. Match found!

The regex with the atomic group, however, exited from an atomic group after bc was matched. At that point, all backtracking positions for tokens inside the group are discarded. In this example, the alternation's option to try b at the second position in the string is discarded. As a result, when c fails, the regex engine has no alternatives left to try.
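
The walkthrough above can be checked in Java with a short sketch. The class name AtomicAlternationDemo and the helper fullMatch are assumptions made for this example; the two patterns and the two inputs are the ones from the quoted tutorial.

```java
import java.util.regex.Pattern;

class AtomicAlternationDemo {
    // Capturing group: the alternation keeps a backtracking position.
    static final Pattern CAPTURING = Pattern.compile("a(bc|b)c");
    // Atomic group: once "bc" has matched, the "b" alternative is discarded.
    static final Pattern ATOMIC = Pattern.compile("a(?>bc|b)c");

    // Hypothetical helper: true only if the pattern matches the whole input.
    static boolean fullMatch(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"abcc", "abc"}) {
            System.out.println(input
                    + " capturing=" + fullMatch(CAPTURING, input)
                    + " atomic=" + fullMatch(ATOMIC, input));
        }
        // prints:
        // abcc capturing=true atomic=true
        // abc capturing=true atomic=false
    }
}
```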

answered Sep 22 '22 18:09 by kajibu