
Regex in .NET: joining duplicate named captured groups

Tags: .net, regex

Given the expression ^(?<res>a).*(?<res>c) and the test string abc, I expected the named group res to concatenate both captured values and yield ac, but it holds only the last capture, c.

Is there any way for C#'s Regex class to concatenate the captures of a named group within the regex itself?

A related question, "Regex issue with named captured pairs", mentions in passing that Perl/PCRE does not support duplicate named pairs. Here I am on .NET, though, and I'm looking for its specific magic to make the regex return a single match that contains both values found in different parts of the string (that is, abbbbbcdef should return ac).

Calling the regex more than once or joining the resulting groups in code is not an acceptable solution right now; I'm looking to do the whole job inside the regex.

kagali-san asked Sep 01 '11 16:09


2 Answers

The purpose of non-unique group names is merely to provide more flexibility in capturing parts of the string. Taking the captured parts and reassembling them differently is something you do after the regex has matched, typically with the Replace method:

string s0 = @"abbbbbcdef";
string s1 = Regex.Replace(s0, @"^.*(a).*(c).*$", "$1$2");
Console.WriteLine(s1);

output:

ac

This question reminds me of others I've seen where people wanted the regex to "skip" the parts of the string they weren't interested in, that is, to consume some parts but not others. There's no way to do that in any of the regex flavors I'm familiar with.

Alan Moore answered Nov 15 '22 04:11


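// Needs: using System.Linq; using System.Text.RegularExpressions;
var re = new Regex(@"^(?<res>a).*(?<res>c)"); // pattern from the question
var s = "abbbbbcdef";                          // sample input from the question
var match = re.Match(s);
var captures = match.Groups["res"].Captures.Cast<Capture>().Select(c => c.Value);
var result = string.Concat(captures);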

The Cast<Capture>() call is necessary because, on .NET Framework, the CaptureCollection that Captures returns implements only the non-generic IEnumerable, not IEnumerable<T>.
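As a rough, self-contained sketch of the difference between a group's Value and its Captures, the program below uses the pattern and sample string from the question. The class name CaptureDemo and the helper JoinCaptures are hypothetical names introduced only for illustration; they are not part of the .NET API.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: concatenates every capture recorded for one group.
    // Returns an empty string when the pattern does not match at all.
    public static string JoinCaptures(Regex re, string input, string groupName)
    {
        Match match = re.Match(input);
        if (!match.Success)
            return string.Empty;

        // Cast<Capture>() works whether or not the collection is generic.
        return string.Concat(
            match.Groups[groupName].Captures.Cast<Capture>().Select(c => c.Value));
    }

    public static void Main()
    {
        var re = new Regex(@"^(?<res>a).*(?<res>c)");
        const string s = "abbbbbcdef";

        // Group.Value exposes only the most recent capture of the group.
        Console.WriteLine(re.Match(s).Groups["res"].Value); // c

        // Captures keeps all of them, in the order they were matched.
        Console.WriteLine(JoinCaptures(re, s, "res"));      // ac
    }
}
```

The joining still happens in code rather than inside the pattern, so this does not meet the question's "regex only" constraint; it only shows that both values are available from a single Match call.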

svick answered Nov 15 '22 05:11