I'm trying to understand the following regular expression quantifier (a is just an example token here):
a{n}?
How does the question mark affect the match of the above expression? And how does it differ from the following?
a{n}
I would have expected the pattern aa{1}?a to match both aaa and aa, for example. While it matches aaa, aa is not a match. The pattern a(a{1})?a does match both, so the parentheses do make a difference here.
Note: The MSDN article Quantifiers in Regular Expressions states for both:
The {n} quantifier matches the preceding element exactly n times, where n is any integer.
For {n}?, it adds the following, not overly helpful part:
It is the lazy counterpart of the greedy quantifier {n}+.
Nothing. The article states:
The {n} quantifier matches the preceding element exactly n times, where n is any integer. {n} is a greedy quantifier whose lazy equivalent is {n}?.
…
The {n}? quantifier matches the preceding element exactly n times, where n is any integer. It is the lazy counterpart of the greedy quantifier {n}+.
Notice the text is exactly the same. Basically, adding ? does not change the behavior of the quantifier. It appears that .NET's regular expression engine supports {n}? as an alternative to {n}.
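A quick way to check this claim is to run both forms over a few inputs. The sketch below uses Python's re module as a stand-in, not the .NET engine; Python is one of the engines that reads {n}? as a lazy fixed-count quantifier, so the two patterns should agree on every sample. The variable names are only illustrative.

```python
import re

# Python's re is a stand-in here, not the .NET engine.
plain = re.compile(r"aa{1}a")
with_question_mark = re.compile(r"aa{1}?a")

samples = ["a", "aa", "aaa", "aaaa"]

# For each sample, record whether the whole string matches each pattern.
results = {
    s: (bool(plain.fullmatch(s)), bool(with_question_mark.fullmatch(s)))
    for s in samples
}

for s, (p, q) in results.items():
    print(f"{s!r}: {{n}} -> {p}, {{n}}? -> {q}")
```

With this engine, only aaa matches, and it matches under both patterns, which is consistent with the ? having no effect on a fixed count.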
Interestingly, this article does appear to contain an error:
The {n,} quantifier matches the preceding element at least n times, where n is any integer. {n,} is a greedy quantifier whose lazy equivalent is {n}?.
This is wrong. The lazy equivalent of {n,} is {n,}?, which is not the same as {n}?.
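The difference between {n,}? and {n}? shows up once something has to follow the quantified token. Here is a minimal sketch, again using Python's re module as a stand-in for .NET; the sample text is made up for illustration.

```python
import re

text = "aaaab"

# {2,}? is lazy but open-ended: it starts with two a's and
# takes more only until the following b can match.
lazy_open = re.match(r"a{2,}?b", text)

# {2}? is still a fixed count: exactly two a's, then b must
# follow immediately, which fails at the start of this text.
lazy_fixed = re.match(r"a{2}?b", text)

print(lazy_open.group() if lazy_open else None)
print(lazy_fixed.group() if lazy_fixed else None)
```

The open-ended lazy form can grow past two repetitions when it has to, while the fixed form cannot, so the two are not interchangeable.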
UPDATE: Newer versions of the article have corrected this error.
More a notice than an answer, but good to know, in particular if you plan to use the same pattern in different languages or decide to use another regex library with .NET.
About:
I would have expected the pattern aa{1}?a to match both aaa and aa for example. While it matches aaa, aa is not a match.
a{n} and a{n}? produce the same result (they are seen as the greedy and non-greedy versions of a fixed quantifier) with most regex engines.
But this is not the case with the Oniguruma and Onigmo regex engines. With them, a{n}? behaves like (?:a{n})?.
Since .NET wrappers exist for these libraries, this is worth clarifying.
The same applies to ERE (Extended Regular Expressions), used in sed, grep, and database systems.
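If a pattern has to behave the same way across engines, one option is to avoid a{n}? and spell out the intended reading with an explicit group. The sketch below runs in Python's re module only; it does not execute Oniguruma, Onigmo, or an ERE tool. It shows how the explicit (?:a{n})? form differs from the bare a{n}? form in an engine that treats the latter as a lazy fixed count.

```python
import re

# Bare form: Python reads {1}? as a lazy fixed count (exactly one a).
ambiguous = re.compile(r"aa{1}?a")

# Explicit form: the group makes the "optional a{1}" reading unambiguous.
explicit_optional = re.compile(r"a(?:a{1})?a")

for s in ("aa", "aaa"):
    print(s, bool(ambiguous.fullmatch(s)), bool(explicit_optional.fullmatch(s)))
```

Here the explicit form matches both aa and aaa, while the bare form matches only aaa. Writing the group out removes the dependence on how a given engine interprets a ? after {n}.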