
Regular expression to select repeating groups

Tags:

.net

regex

I have a series of grouped values that follow a specific format, and I would like to use a single expression to capture them into groups. For example, I have -group1 -group2 -group3 and am trying to use something similar to (-[\s\S]{1,}?). This captures the entire string into a single group, but I'd like to be able to backreference each of the values separately. I figured the ? would make the match non-greedy and therefore split it into three separate groups. For now I am simply repeating the group (-[\s\S]*?) in the pattern, but it seems there should be a more elegant expression.
Thanks!

McArthey asked Jun 15 '12 13:06




1 Answer

You are in luck, because C# (through the .NET regex engine) is one of the few languages, if not the only one, that supports subexpression captures:

https://msdn.microsoft.com/en-us/library/system.text.regularexpressions.capture(v=vs.110)

The .NET API can be viewed as the following hierarchy:

 Matches
     Groups (most regex engines stop here)
         Captures (unique to .NET)

It's not clear from your question exactly what you want to match, but this should get you started. Ask again if you are stuck.

  using System;
  using System.Text.RegularExpressions;

  string input = "-group1 -group2 ";
  // {2} repeats the group twice; .NET keeps every repetition in Group.Captures.
  string pattern = @"(-\S*\W){2}";
  foreach (Match match in Regex.Matches(input, pattern))
  {
     Console.WriteLine("Match: {0}", match.Value);
     for (int groupCtr = 0; groupCtr < match.Groups.Count; groupCtr++)
     {
        Group group = match.Groups[groupCtr];
        // group.Value holds only the last repetition of the group.
        Console.WriteLine("   Group {0}: {1}", groupCtr, group.Value);
        // group.Captures holds every repetition, in order.
        for (int captureCtr = 0; captureCtr < group.Captures.Count; captureCtr++)
           Console.WriteLine("      Capture {0}: {1}", captureCtr,
                             group.Captures[captureCtr].Value);
     }
  }

This outputs:

Match: -group1 -group2 
   Group 0: -group1 -group2 
      Capture 0: -group1 -group2 
   Group 1: -group2 
      Capture 0: -group1 
      Capture 1: -group2 

As you can see, (Group 1, Capture 0) and (Group 1, Capture 1) give you the individual captures of a group, and not just the last one as in most languages.

I think this addresses what you describe as "to be able to backreference each of the values separately".

(You use the term backreference, but I don't think you are aiming for a replacement pattern, right?)
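
Applied to the three values from the question, the same idea might look like the sketch below. It assumes each value is a dash followed by non-whitespace characters and that values are separated by whitespace. The CaptureDemo class and GetValues helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class CaptureDemo
{
    // Hypothetical helper: returns every value captured by group 1
    // across all repetitions of the outer, non-capturing group.
    public static List<string> GetValues(string input)
    {
        // (?: ... )+ repeats without capturing;
        // (-\S+) captures one value per repetition.
        Match match = Regex.Match(input, @"(?:(-\S+)\s*)+");
        var values = new List<string>();
        if (!match.Success)
            return values;

        // Groups[1].Value would hold only the last repetition;
        // Groups[1].Captures holds all of them, in order.
        foreach (Capture capture in match.Groups[1].Captures)
            values.Add(capture.Value);
        return values;
    }

    static void Main()
    {
        foreach (string value in GetValues("-group1 -group2 -group3"))
            Console.WriteLine(value);
    }
}
```

With the input "-group1 -group2 -group3" this prints -group1, -group2 and -group3 on separate lines. The outer group is non-capturing so that group 1 contains only the value itself, without the trailing whitespace.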

buckley answered Sep 26 '22 07:09