
Regex and proper capture using .Matches and .Concat in C#

I have the following regex:

@"{thing:(?:((\w)\2*)([^}]*?))+}"

I'm using it to find matches within a string:

MatchCollection matches = regex.Matches(input);
IEnumerable<string> formatTokens = matches[0].Groups[3].Captures
                                   .OfType<Capture>()
                                   .Where(i => i.Length > 0)
                                   .Select(i => i.Value)
                                   .Concat(matches[0].Groups[1].Captures.OfType<Capture>().Select(i => i.Value));

This used to yield the results I wanted; however, my goal has since changed. This is the desired behavior now:

Suppose the string entered is 'stuff/{thing:aa/bb/cccc}{thing:cccc}'

I want formatTokens to be:

formatTokens[0] == "aa/bb/cccc"
formatTokens[1] == "cccc"

Right now, this is what I get:

formatTokens[0] == "/"
formatTokens[1] == "/"
formatTokens[2] == "cccc"
formatTokens[3] == "bb"
formatTokens[4] == "aa"

Note especially that "cccc" does not appear twice even though it was entered twice.

I think the problems are 1) the recapture in the regex and 2) the concat configuration (which is from when I wanted everything separated), but so far I haven't been able to find a combination that yields what I want. Can someone shed some light on the proper regex/concat combination to yield the desired results above?

Courtney Thurston asked Jun 19 '18 23:06


1 Answer

You may use

Regex.Matches(s, @"{thing:([^}]*)}")
    .Cast<Match>()
    .Select(x => x.Groups[1].Value)
    .ToList()

See the regex demo

Details

  • {thing: - a literal {thing: substring
  • ([^}]*) - Capturing group #1 (when a match is obtained, its value can be accessed via match.Groups[1].Value): zero or more characters other than }
  • } - a } char.

This way, you find multiple matches and only collect Group 1 values in the resulting list/array.
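As a rough end-to-end sketch of the approach above, the following program runs the simplified pattern against the input string from the question. The class name ThingTokens and the method name Extract are made up for illustration, and the braces are escaped here only for readability (the unescaped pattern shown above behaves the same for this input). Note that the original code only ever inspected matches[0], which is one reason the second {thing:cccc} block never showed up in its output; iterating over every match avoids that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative, not from the original post.
public static class ThingTokens
{
    // One match per {thing:...} block; group 1 holds everything
    // between "{thing:" and the closing "}".
    private static readonly Regex ThingPattern = new Regex(@"\{thing:([^}]*)\}");

    public static List<string> Extract(string input)
    {
        return ThingPattern.Matches(input)
            .Cast<Match>()                  // MatchCollection -> IEnumerable<Match>
            .Select(m => m.Groups[1].Value) // keep only the captured body
            .ToList();
    }

    public static void Main()
    {
        List<string> formatTokens = Extract("stuff/{thing:aa/bb/cccc}{thing:cccc}");
        for (int i = 0; i < formatTokens.Count; i++)
        {
            Console.WriteLine($"formatTokens[{i}] == \"{formatTokens[i]}\"");
        }
        // Prints:
        // formatTokens[0] == "aa/bb/cccc"
        // formatTokens[1] == "cccc"
    }
}
```

Because the result is a List<string>, it can be indexed directly, which a bare IEnumerable does not allow.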

Wiktor Stribiżew answered Oct 17 '22 22:10