
Is there a regex flavor that allows me to count the number of repetitions matched by the * and + operators?

Is there a regex flavor that allows me to count the number of repetitions matched by the * and + operators? I'd specifically like to know if it's possible under the .NET Platform.

luvieere asked Jun 12 '10 15:06



2 Answers

You're in luck: .NET regex supports this, which I think is unusual among regex flavors. In every Match, each Group keeps every Capture it made, not just the last one.

So you can count how many times a repeatable pattern matched an input by:

  • Making it a capturing group
  • Counting how many captures were made by that group in each match
    • You can iterate through the individual captures too if you want!

Here's an example:

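using System;
using System.Text.RegularExpressions;

// Group 1 is the repeated unit ("ha" or "hua"); + repeats it one or more times.
Regex r = new Regex(@"\b(hu?a)+\b");

var text = "hahahaha that's funny but not huahuahua more like huahahahuaha";
foreach (Match m in r.Matches(text)) {
   // Captures.Count is the number of repetitions group 1 matched in this match.
   Console.WriteLine(m + " " + m.Groups[1].Captures.Count);
}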

This prints (as seen on ideone.com):

hahahaha 4
huahuahua 3
huahahahuaha 5
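Iterating over the individual captures, as mentioned in the list above, could look like the sketch below. The helper name DescribeCaptures and the shortened input "huahaha" are made up for illustration; Match.Groups, Group.Captures, Capture.Value and Capture.Index are the standard System.Text.RegularExpressions members.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the original answer): returns one line
// per repetition captured by group 1, across all matches in the input.
List<string> DescribeCaptures(string pattern, string input)
{
    var lines = new List<string>();
    foreach (Match m in Regex.Matches(input, pattern))
    {
        // Each Capture is one repetition of the group inside this match.
        foreach (Capture c in m.Groups[1].Captures)
        {
            lines.Add($"{m.Value}: '{c.Value}' at index {c.Index}");
        }
    }
    return lines;
}

foreach (var line in DescribeCaptures(@"\b(hu?a)+\b", "huahaha"))
{
    Console.WriteLine(line);
}
// Prints:
// huahaha: 'hua' at index 0
// huahaha: 'ha' at index 3
// huahaha: 'ha' at index 5
```

Capture.Index is relative to the whole input string, not to the start of the match, so it can be used directly to locate each repetition.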

API references

  • CaptureCollection
polygenelubricants answered Oct 14 '22 13:10


You can use parentheses in the expression to create a group and then use the + or * operator on the group. The Captures property of the Group can be used to determine how many times it was matched. The following example counts the number of consecutive lower-case letters at the start of a string:

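using System;
using System.Text.RegularExpressions;

// Group 1 matches a single lower-case letter; + repeats it.
var regex = new Regex(@"^([a-z])+");
var match = regex.Match("abc def");

if (match.Success)
{
    // One capture per repetition: "a", "b", "c", so this prints 3.
    Console.WriteLine(match.Groups[1].Captures.Count);
}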
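Since the question also asks about *, here is a minimal sketch of how the count behaves when the group repeats zero times, and why Captures is needed rather than the group's Value. CountRepetitions is a hypothetical helper, and returning -1 for "no match" is just a convention chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: number of repetitions matched by group 1,
// or -1 (an arbitrary sentinel) if the regex did not match at all.
int CountRepetitions(string pattern, string input)
{
    Match match = Regex.Match(input, pattern);
    return match.Success ? match.Groups[1].Captures.Count : -1;
}

Console.WriteLine(CountRepetitions(@"^([a-z])+", "abc def")); // 3
Console.WriteLine(CountRepetitions(@"^([a-z])*", "123"));     // 0: the match succeeds but is empty
Console.WriteLine(CountRepetitions(@"^([a-z])+", "123"));     // -1: + needs at least one repetition

// Group.Value only exposes the last repetition, so it cannot be used for counting.
Console.WriteLine(Regex.Match("abc def", @"^([a-z])+").Groups[1].Value); // c
```

With *, a successful match can still have zero captures in the group, so checking Match.Success alone does not tell you that the group took part in the match.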
Phil Ross answered Oct 14 '22 13:10