
How do a{n}? and a{n} differ?

Tags: .net, regex

I'm trying to understand the following regular expression quantifier (a is just an exemplary token here):

a{n}?

How does the question mark affect the match of the above expression? And how does it differ from the following?

a{n}

I would have expected the pattern aa{1}?a to match both aaa and aa for example. While it matches aaa, aa is not a match. The pattern a(a{1})?a does match both, so the parentheses do make a difference here.


Note: The MSDN article Quantifiers in Regular Expressions states for both:

The {n} quantifier matches the preceding element exactly n times, where n is any integer.

For {n}?, it adds the following, not overly helpful part:

It is the lazy counterpart of the greedy quantifier {n}+.

Marius Schulz asked Aug 01 '13 23:08




2 Answers

Nothing. The article states:

The {n} quantifier matches the preceding element exactly n times, where n is any integer. {n} is a greedy quantifier whose lazy equivalent is {n}?.

The {n}? quantifier matches the preceding element exactly n times, where n is any integer. It is the lazy counterpart of the greedy quantifier {n}+.

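Notice that the descriptions are essentially identical. Adding ? does not change the behavior of the quantifier: because {n} must match exactly n times, there is no shorter or longer match for laziness to choose between. It appears that .NET's regular expression engine simply accepts {n}? as an alternative to {n}.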
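As a rough check of this claim, here is a minimal sketch using Python's re module as a stand-in for the .NET engine. That substitution is an assumption on my part (Python also reads {n}? as the lazy form of a fixed quantifier), and the helper name spans and the sample strings are made up for illustration:

```python
import re

def spans(pattern, text):
    """Return the (start, end) span of every non-overlapping match."""
    return [m.span() for m in re.finditer(pattern, text)]

# Illustrative inputs only; they are not taken from the question.
samples = ["a", "aa", "aaa", "aaaaa", "baaab"]

# Collect the match positions for the greedy and the "lazy" fixed quantifier.
greedy = {s: spans(r"a{2}", s) for s in samples}
lazy = {s: spans(r"a{2}?", s) for s in samples}

# Both forms must consume exactly two characters, so the results coincide.
print(greedy == lazy)
```

If the two dictionaries ever differed, the quantifier would not be behaving as a fixed count in that engine.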


Interestingly, this article does appear to contain an error:

The {n,} quantifier matches the preceding element at least n times, where n is any integer. {n,} is a greedy quantifier whose lazy equivalent is {n}?.

This is wrong. The lazy equivalent of {n,} is {n,}? which is not the same as {n}?.
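To make the distinction concrete, here is a small sketch, again using Python's re module as an assumed stand-in for .NET. The input strings are invented for the example:

```python
import re

text = "aaaaa"

# {2,} is greedy: it takes as many a's as it can.
greedy_at_least = re.match(r"a{2,}", text).group()

# {2,}? is lazy: it stops as soon as the minimum of two is satisfied.
lazy_at_least = re.match(r"a{2,}?", text).group()

# The lazy "at least" form can still grow when the rest of the pattern needs it...
lazy_can_expand = re.fullmatch(r"a{2,}?b", "aaaab") is not None

# ...whereas {2}? is pinned to exactly two repetitions and cannot.
fixed_can_expand = re.fullmatch(r"a{2}?b", "aaaab") is not None

print(greedy_at_least, lazy_at_least, lazy_can_expand, fixed_can_expand)
```

The last two checks are the important ones: {n,}? may match more than n repetitions when forced to, while {n}? never can.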

UPDATE: Newer versions of the article have corrected this error.

p.s.w.g answered Sep 22 '22 18:09


This is more of a note than an answer, but it is good to know, particularly if you plan to use the same pattern in different languages or decide to use another regex library with .NET.

About:

I would have expected the pattern aa{1}?a to match both aaa and aa for example. While it matches aaa, aa is not a match.

a{n} and a{n}? produce the same result with most regex engines (they are treated as the greedy and non-greedy versions of a fixed quantifier).

But this is not the case with the Oniguruma and Onigmo regex engines. With them, a{n}? behaves like (?:a{n})?. Since .NET wrappers exist for these libraries, it is worth clarifying.

The same applies to ERE (Extended Regular Expressions), as used in sed and grep, and in DBMSs.
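The two readings can be compared side by side by writing the (?:a{n})? interpretation out explicitly. This is only a sketch in Python's re module, which (as an assumption for this illustration) follows the "fixed, lazy" reading like .NET. It does not run Oniguruma itself, and the variable names are hypothetical:

```python
import re

# Reading 1: {1}? is "exactly one a, lazily" (the .NET-style interpretation).
fixed_lazy = re.compile(r"aa{1}?a")

# Reading 2: the ? makes the whole a{1} optional, written out explicitly here
# to imitate how the answer says Oniguruma/Onigmo treat a{1}?.
optional_group = re.compile(r"a(?:a{1})?a")

# For each input from the question, record whether each reading matches fully.
results = {
    s: (bool(fixed_lazy.fullmatch(s)), bool(optional_group.fullmatch(s)))
    for s in ("aa", "aaa")
}
print(results)
```

Under the first reading only aaa matches. Under the second both aa and aaa match, which is the behavior the question originally expected.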

Casimir et Hippolyte answered Sep 22 '22 18:09