I have the following Java regex, which I didn't write and I am trying to modify:
^class-map(?:(\\s+match-all)|(\\s+match-any))?(\\s+[\\x21-\\x7e]{1,40})$
It's similar to this one.
Note the first question mark. Does it mean that the group is optional? There is already a question mark after the corresponding ). Does the colon have a special meaning in regex?
The regex compiles fine, and there are already JUnit tests that show how it works. It's just that I'm a bit confused about why the first question mark and colon are there.
It means that it is a non-capturing group.
The question mark and the colon after the opening parenthesis are the syntax that creates a non-capturing group. The regex Set(Value)? matches Set or SetValue. In the first case, the first (and only) capturing group remains empty. In the second case, the first capturing group matches Value.
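As a rough illustration of the Set(Value)? example in Java, here is a minimal sketch. The class and method names are made up for this post. Note that in Java an optional group that did not take part in the match is reported as null by Matcher.group rather than as an empty string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class OptionalGroupDemo {
    // Returns what group 1 of Set(Value)? captured,
    // or null if the group did not participate (or nothing matched).
    static String capturedSuffix(String input) {
        Matcher m = Pattern.compile("Set(Value)?").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Number of capturing groups declared in a regex.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(capturedSuffix("SetValue"));  // Value
        System.out.println(capturedSuffix("Set"));       // null
        System.out.println(groupCount("Set(Value)?"));   // 1
        System.out.println(groupCount("Set(?:Value)?")); // 0
    }
}
```

Both patterns accept the same strings; the only difference is whether a numbered group exists afterwards.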
Placing - at the start or the end of a character class makes it match a literal "-". As mentioned in the comments by Keoki Zee, you can also escape the hyphen inside the class with a backslash, as in [a\-z], but most people simply put it at the end.
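A quick way to check the hyphen placements in Java is something like the sketch below. The class name is hypothetical, and the backslash is doubled only because it sits inside a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class HyphenInClassDemo {
    // True if the given character class matches a single literal hyphen.
    static boolean matchesHyphen(String charClass) {
        return Pattern.matches(charClass, "-");
    }

    public static void main(String[] args) {
        System.out.println(matchesHyphen("[az-]"));    // true: hyphen at the end
        System.out.println(matchesHyphen("[-az]"));    // true: hyphen at the start
        System.out.println(matchesHyphen("[a\\-z]"));  // true: escaped hyphen
        System.out.println(matchesHyphen("[a-z]"));    // false: this is the range a..z
    }
}
```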
A regular expression followed by a plus sign (+) matches one or more occurrences of the regular expression. A regular expression followed by a question mark (?) matches zero or one occurrences of the regular expression.
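To see the two quantifiers side by side, here is a small sketch using String.matches, which requires the whole string to match. The class and method names are invented for the example.

```java
// Hypothetical demo class; the names are illustrative only.
class QuantifierDemo {
    // a+ : one or more occurrences of "a"
    static boolean oneOrMore(String s) {
        return s.matches("a+");
    }

    // a? : zero or one occurrence of "a"
    static boolean zeroOrOne(String s) {
        return s.matches("a?");
    }

    public static void main(String[] args) {
        System.out.println(oneOrMore(""));     // false: needs at least one a
        System.out.println(oneOrMore("aaa"));  // true
        System.out.println(zeroOrOne(""));     // true: zero occurrences is allowed
        System.out.println(zeroOrOne("aaa"));  // false: more than one a
    }
}
```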
(?: starts a non-capturing group. It's no different to ( unless you're retrieving groups from the regex after use. See What is a non-capturing group? What does a question mark followed by a colon (?:) mean?.
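Applied to the regex from the question, this means the outer (?: ... )? only makes the match-all / match-any part optional and does not take a group number. A minimal sketch, assuming input lines such as "class-map match-any VOICE" (the sample lines and the class name are made up here, not taken from the original tests):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the sample input lines are assumptions.
class ClassMapDemo {
    static final Pattern CLASS_MAP = Pattern.compile(
        "^class-map(?:(\\s+match-all)|(\\s+match-any))?(\\s+[\\x21-\\x7e]{1,40})$");

    // Returns groups 1..n of the match, or null if the line does not match.
    static String[] groups(String line) {
        Matcher m = CLASS_MAP.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] out = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            out[i - 1] = m.group(i); // null when the group did not participate
        }
        return out;
    }

    public static void main(String[] args) {
        // Only three numbered groups exist; the (?: ... ) wrapper is not counted.
        String[] g = groups("class-map match-any VOICE");
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + (i + 1) + " = [" + g[i] + "]");
        }
        // Here group 1 is null, group 2 is " match-any", group 3 is " VOICE".
    }
}
```

If the outer group were written with a plain ( instead, it would become group 1 and the existing groups would shift to 2, 3 and 4, which would break any code that reads them by number.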
A little late to this thread - just to build on ryanp's answer.
Assume you have the string aaabbbccc and apply this regex to it:
(a)+(b)+(c)+
This would give you the following 3 groups that matched:
['a', 'b', 'c']
Use the ?: in the first group:
(?:a)+(b)+(c)+
and you would get the following groups that matched:
['b', 'c']
That is why they are called "non-capturing parentheses".
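The same comparison can be reproduced in Java with a sketch along these lines (the helper name is invented). A repeated group such as (a)+ keeps only its last repetition, which is why each captured value is a single letter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class RepeatedGroupDemo {
    // Collects the text of every numbered group after a full match.
    static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        if (m.matches()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(groups("(a)+(b)+(c)+", "aaabbbccc"));   // [a, b, c]
        System.out.println(groups("(?:a)+(b)+(c)+", "aaabbbccc")); // [b, c]
    }
}
```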
Sometimes you use parentheses for other things, for example to set the bounds of the | (or) operator:
"New (York|Jersey)"
In this case, you are only using the parentheses to limit the | (or) alternation, and you don't really want to capture this data. Use non-capturing parentheses to indicate that:
"New (?:York|Jersey)"