Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In C# regular expression why does the initial match show up in the groups?

Tags:

c#

regex

So if I write a regex it's matches I can get the match or I can access its groups. This seems counter intuitive since the groups are defined in the expression with braces "(" and ")". It seems like it is not only wrong but redundant. Any one know why?

Regex quickCheck = new Regex(@"(\D+)\d+");
string source = "abc123";

m.Value        //Equals source
m.Groups.Count //Equals 2
m.Groups[0])   //Equals source
m.Groups[1])   //Equals "abc"
like image 629
QueueHammer Avatar asked Feb 11 '10 22:02

QueueHammer


People also ask

What does << mean in C?

<< is the left shift operator. It is shifting the number 1 to the left 0 bits, which is equivalent to the number 1 .


1 Answers

I agree - it is a little strange, however I think there are good reasons for it.

A Regex Match is itself a Group, which in turn is a Capture.

But the Match.Value (or Capture.Value as it actually is) is only valid when one match is present in the string - if you're matching multiple instances of a pattern, then by definition it can't return everything. In effect - the Value property on the Match is a convenience for when there is only match.

But to clarify where this behaviour of passing the whole match into Groups[0] makes sense - consider this (contrived) example of a naive code unminifier:

[TestMethod]
public void UnMinifyExample()
{
  string toUnMinify = "{int somevalue = 0; /*init the value*/} /* end */";
  string result = Regex.Replace(toUnMinify, @"(;|})\s*(/\*[^*]*?\*/)?\s*", "$0\n");
  Assert.AreEqual("{int somevalue = 0; /*init the value*/\n} /* end */\n", result);
}

The regex match will preserve /* */ comments at the end of a statement, placing a newline afterwards - but works for either ; or } line-endings.

Okay - you might wonder why you'd bother doing this with a regex - but humour me :)

If Groups[0] generated by the matches for this regex was not the whole capture - then a single-call replace would not be possible - and your question would probably be asking why doesn't the whole match get put into Groups[0] instead of the other way round!

like image 59
Andras Zoltan Avatar answered Oct 18 '22 23:10

Andras Zoltan