
Get named group subpattern from .NET regex object

Tags:

c#

.net

regex

Let's say I have the following regex:

var r = new Regex("Space(?<entry>[0-9]{1,3})");

Then I have the string:

"Space123"

Here is my program:

void Main()
{
    Regex r = new Regex("Space(?<entry>[0-9]{1,3})", RegexOptions.ExplicitCapture);
    foreach (Match m in r.Matches("Space123")){
        m.Groups["entry"].Dump(); // Dump() is a LINQPad extension that echoes the object to the console
    }
}

What I want to know is whether there is any way to get the part of the regular expression that matched. In this case:

(?<entry>[0-9]{1,3})

I can't find it anywhere in the object, but one would think it would be accessible.

Matt asked May 04 '15 23:05



1 Answer

You can use the Regex.ToString() method, which returns the regular expression pattern. Named capture groups and their respective indices can be obtained using Regex.GetGroupNames() and Regex.GetGroupNumbers().
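As a minimal sketch of what those calls give you for the pattern from the question: the class name GroupInfo and the helper NameToNumber are hypothetical names introduced only for this illustration, and Regex.GroupNumberFromName() is used here in place of zipping the two arrays.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class GroupInfo
{
    // Hypothetical helper: maps each group name to the number the Regex reports for it.
    public static Dictionary<string, int> NameToNumber(Regex r)
    {
        var map = new Dictionary<string, int>();
        foreach (string name in r.GetGroupNames())
            map[name] = r.GroupNumberFromName(name);
        return map;
    }

    static void Main()
    {
        var r = new Regex("Space(?<entry>[0-9]{1,3})", RegexOptions.ExplicitCapture);

        // ToString() returns the pattern text that was passed to the constructor.
        Console.WriteLine(r.ToString());

        // Group "0" is the whole match; "entry" is the only named group here.
        foreach (var pair in NameToNumber(r))
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

Note that none of these calls returns the subpattern text of a group; they only give the full pattern plus the group names and numbers, which is why the pattern string still has to be split up separately.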

Also, we need an array/list of the capture groups inside the regex pattern, which is why I am adding the rxPairedRoundBrackets regex to capture all text inside unescaped round brackets.

I suggest this code to get the regex subpattern for a specific named group (edit: now, even handling nested unescaped parenthetical groups!):

var rxPairedRoundBrackets = new Regex(@"(?sx)(?=((?<=[^\\]|^)\(
        (?>
          (?! (?<!\\)\( | (?<!\\)\) ) .
          |
          (?<!\\)\( (?<Depth>)
          |
          (?<!\\)\) (?<-Depth>)
        )*
        (?(Depth)(?!))
        (?<!\\)\)))+");
var r = new Regex(@"(?<OuterSpace>Spa(?<ce>ce))(?<entry>\([0-9]{1,3}\))", RegexOptions.ExplicitCapture);
var bracketedGrps = rxPairedRoundBrackets.Matches(r.ToString()).Cast<Match>().Select(p => p.Groups[1].Value);
var GroupDict = r.GetGroupNames().Zip(r.GetGroupNumbers(), (s, i) => new { s, i })
                                 .ToDictionary(item => item.s, item => item.i);
foreach (Match m in r.Matches("My New Space(123)"))
{
    var id = "entry";
    var grp = m.Groups[id]; // Just to see the group value
    var groupThatMatched = bracketedGrps.ElementAt(GroupDict[id] - 1);
}

And here is the code for your case:

r = new Regex("Space(?<entry>[0-9]{1,3})", RegexOptions.ExplicitCapture);
bracketedGrps = rxPairedRoundBrackets.Matches(r.ToString()).Cast<Match>().Select(p => p.Groups[1].Value);
GroupDict = r.GetGroupNames().Zip(r.GetGroupNumbers(), (s, i) => new { s, i })
                             .ToDictionary(item => item.s, item => item.i);
foreach (Match m in r.Matches("Space123"))
{
   var id = "entry";
   var grp = m.Groups[id];
   var groupThatMatched = bracketedGrps.ElementAt(GroupDict[id] - 1);
}

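For comparison, the same idea of collecting every balanced, unescaped parenthesized span can be sketched with a plain stack-based scan instead of a balancing-group regex. This is a swapped-in alternative, not the code above: ParenScanner and BalancedGroups are hypothetical names, and the sketch assumes the pattern has no parentheses inside character classes and that every parenthesized span is a named capturing group (no non-capturing groups or lookarounds), so that group number N corresponds to the N-th opening parenthesis.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ParenScanner
{
    // Hypothetical helper: returns every balanced "( ... )" span in the pattern,
    // ordered by the position of its opening parenthesis.
    // Assumption: no parentheses appear inside character classes such as [()].
    public static List<string> BalancedGroups(string pattern)
    {
        var spans = new SortedDictionary<int, string>();
        var open = new Stack<int>();
        for (int i = 0; i < pattern.Length; i++)
        {
            char c = pattern[i];
            if (c == '\\') { i++; continue; }   // skip the escaped character
            if (c == '(')
            {
                open.Push(i);
            }
            else if (c == ')' && open.Count > 0)
            {
                int start = open.Pop();
                spans[start] = pattern.Substring(start, i - start + 1);
            }
        }
        return new List<string>(spans.Values);
    }

    static void Main()
    {
        var r = new Regex(@"(?<OuterSpace>Spa(?<ce>ce))(?<entry>\([0-9]{1,3}\))",
                          RegexOptions.ExplicitCapture);
        var groups = BalancedGroups(r.ToString());

        // Assumption: all groups are named, so group number N is the N-th span.
        int number = r.GroupNumberFromName("entry");
        Console.WriteLine(groups[number - 1]);   // prints (?<entry>\([0-9]{1,3}\))
    }
}
```

Ordering the spans by where they open matters because nested groups close before their parents do, while group numbers follow the opening parentheses.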

Wiktor Stribiżew answered Oct 23 '22 06:10