Count regex matches with streams

I am trying to count the number of matches of a regex pattern with a simple Java 8 lambdas/streams based solution. For example, for this pattern and matcher:

final Pattern pattern = Pattern.compile("\\d+");
final Matcher matcher = pattern.matcher("1,2,3,4");

There is the method splitAsStream, which splits the text on the given pattern instead of matching it. Although it is elegant and preserves immutability, it is not always correct:

// count is 4, correct
final long count = pattern.splitAsStream("1,2,3,4").count();

// count is 0, wrong (expected 1)
final long countSingle = pattern.splitAsStream("1").count();
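
One way to see why the second count comes out as 0 is to collect the pieces instead of counting them. This is only a minimal sketch; the class name SplitTokens and the helper tokens are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class SplitTokens {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: returns the pieces that splitAsStream produces.
    static List<String> tokens(String input) {
        return DIGITS.splitAsStream(input).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The pieces are the text between the numbers, not the numbers themselves.
        System.out.println(tokens("1,2,3,4"));
        System.out.println(tokens("1"));
    }
}
```

For "1,2,3,4" the pieces are an empty leading string followed by three commas, which only happens to add up to 4. For "1" nothing is left, because trailing empty strings are discarded.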

I also tried (ab)using an IntStream. The problem is that I have to guess how many times to call matcher.find(), instead of calling it until it returns false.

final long count = IntStream
        .iterate(0, i -> matcher.find() ? 1 : 0)
        .limit(100)
        .sum();

I am familiar with the traditional solution, while (matcher.find()) count++; where count is mutable. Is there a simple way to do this with Java 8 lambdas/streams?

Manos Nikolaidis asked Dec 30 '15 14:12

1 Answer

To use Pattern::splitAsStream properly, you have to invert your regex. That means that instead of \\d+ (which splits on every number) you should use \\D+ (which splits on everything that is not a number). This gives you every number in your String.

final Pattern pattern = Pattern.compile("\\D+");
// count is 4
long count = pattern.splitAsStream("1,2,3,4").count();
// count is 1
count = pattern.splitAsStream("1").count();
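
One thing to watch with the inverted pattern: if the input starts with a non-digit, the stream begins with an empty string and the count is one too high. Below is a minimal sketch that filters out empty pieces. It also shows Matcher.results(), which streams the matches directly, assuming Java 9 or later is available. The class and method names are illustrative only.

```java
import java.util.regex.Pattern;

public class MatchCount {
    private static final Pattern NON_DIGITS = Pattern.compile("\\D+");
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Split on non-digits and drop empty pieces (e.g. the leading one in ",1,2").
    static long countBySplit(String input) {
        return NON_DIGITS.splitAsStream(input)
                .filter(s -> !s.isEmpty())
                .count();
    }

    // Assumes Java 9+: Matcher.results() returns a Stream<MatchResult>.
    static long countByResults(String input) {
        return DIGITS.matcher(input).results().count();
    }

    public static void main(String[] args) {
        System.out.println(countBySplit("1,2,3,4"));
        System.out.println(countBySplit(",1,2"));
        System.out.println(countByResults("1"));
    }
}
```

The split-based variant stays within Java 8. The results() variant matches the pattern directly instead of splitting around it, so the regex does not need to be inverted.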
Flown answered Sep 22 '22 16:09