
Regex MatchCollection gets too many results

Tags: c#, regex

Using C# and Regex I have a strange situation:

string substr = "9074552545,9075420530,9075662235,9075662236,9075952311,9076246645";
MatchCollection collection = Regex.Matches(substr, @"[\d]*");

I would expect 'collection' to contain 6 results. Strangely enough, it contains 12, and every second result is {} (empty).

I have tried rewriting it to:

string substr = "9074552545,9075420530,9075662235,9075662236,9075952311,9076246645";
Regex regex = new Regex(@"[\d]*");
MatchCollection collection = regex.Matches(substr);

But it gives me the exact same result. What am I missing here?

I am using .NET Framework 4.5 and C#.

olf asked Jun 10 '13 11:06



1 Answer

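I believe the problem is your * quantifier. It matches zero or more occurrences of the preceding element, which means an empty string is technically a match. After each number the engine finds an empty match at the following comma, and one more at the end of the string, which accounts for the 6 extra empty results. You need to use the + quantifier, like this: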

string substr = "9074552545,9075420530,9075662235,9075662236,9075952311,9076246645";
MatchCollection collection = Regex.Matches(substr, @"\d+");

// or
Regex regex = new Regex(@"\d+");
MatchCollection collection = regex.Matches(substr);

This ensures that only matches containing one or more digits are returned.

Note, I've also dropped the character class ([]) around your \d as it's completely unnecessary here.
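
To see the difference between the two quantifiers side by side, here is a minimal sketch that counts the matches each pattern produces on the string from the question. The class name MatchDemo and the helper MatchValues are made up for illustration; only Regex.Matches and LINQ come from the framework.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchDemo
{
    // Hypothetical helper: returns the text of every match, empty ones included.
    public static string[] MatchValues(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()          // MatchCollection is non-generic on .NET Framework 4.5
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string substr = "9074552545,9075420530,9075662235,9075662236,9075952311,9076246645";

        string[] star = MatchValues(substr, @"\d*");
        string[] plus = MatchValues(substr, @"\d+");

        Console.WriteLine(star.Length);                     // 12
        Console.WriteLine(star.Count(v => v.Length == 0));  // 6 (one per comma, one at end of input)
        Console.WriteLine(plus.Length);                     // 6
    }
}
```

If you ever have to keep a pattern that can match the empty string, filtering on m.Length > 0 would also discard the empty results, but switching to + is the simpler fix here.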

Further Reading:

  • Quantifiers in Regular Expressions
p.s.w.g answered Sep 19 '22 10:09
