Count number of matches of a regex in JavaScript

tl;dr: Generic Pattern Counter

// THIS IS WHAT YOU NEED
const count = (str) => {
  const re = /YOUR_PATTERN_HERE/g
  return ((str || '').match(re) || []).length
}

If you arrived here looking for a generic way to count the occurrences of a regex pattern in a string, and you don't want it to fail when there are zero occurrences, this code is what you need. Here's a demonstration:

/*
 *  Example
 */

const count = (str) => {
  const re = /[a-z]{3}/g
  return ((str || '').match(re) || []).length
}

const str1 = 'abc, def, ghi'
const str2 = 'ABC, DEF, GHI'

console.log(`'${str1}' has ${count(str1)} occurrences of pattern '/[a-z]{3}/g'`)
console.log(`'${str2}' has ${count(str2)} occurrences of pattern '/[a-z]{3}/g'`)

Original Answer

The problem with your initial code is that you are missing the global flag (g):

>>> 'hi there how are you'.match(/\s/g).length;
4

Without the g flag, match() only finds the first occurrence and stops there.
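
To make the effect of the flag concrete, here is a small sketch that runs both variants on the same string. The variable names are only illustrative and are not from the original question.

```javascript
const sentence = 'hi there how are you';

// Without g: match() returns only the first match (plus any capture groups,
// of which there are none here), so the array has a single element.
const withoutGlobal = sentence.match(/\s/).length;

// With g: match() returns an array of every match.
const withGlobal = sentence.match(/\s/g).length;

console.log(withoutGlobal, withGlobal); // 1 4
```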

Also note that your regex counts every whitespace character separately, so two successive spaces count as two matches:

>>> 'hi  there'.match(/\s/g).length;
2

If that is not desirable, you could do this:

>>> 'hi  there'.match(/\s+/g).length;
1

As mentioned in my earlier answer, you can use RegExp.exec() to iterate over all matches and count each occurrence. The only advantage is lower memory use, because no array of matches is built; overall it's about 20% slower than using String.match().

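function countWhitespace(text) {
    // The g flag is required: exec() then resumes from re.lastIndex on each call
    var re = /\s/g,
        count = 0;

    // exec() returns null once there are no more matches
    while (re.exec(text) !== null) {
        ++count;
    }

    return count;
}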
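
A related option, not part of the original answer, is String.prototype.matchAll() (ES2020), which returns an iterator over the matches, so they can be counted without building an array. This is only a sketch: `countWithMatchAll` is a hypothetical helper name, and it makes no claim about relative performance.

```javascript
// Hypothetical helper, shown for illustration only.
function countWithMatchAll(text, re) {
  // matchAll() throws a TypeError if the regex lacks the g flag.
  let count = 0;
  for (const _match of text.matchAll(re)) {
    count++;
  }
  return count;
}

console.log(countWithMatchAll('hi there how are you', /\s/g)); // 4
console.log(countWithMatchAll('hithere', /\s/g)); // 0
```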

(('a a a').match(/b/g) || []).length; // 0
(('a a a').match(/a/g) || []).length; // 3

Based on https://stackoverflow.com/a/48195124/16777 but fixed to actually work in the zero-results case.


('my string'.match(/\s/g) || []).length;


Here is a similar solution to @Paolo Bergantino's answer, but with modern operators. I'll explain below.

    const matchCount = (str, re) => {
      return str?.match(re)?.length ?? 0;
    };

    // usage
    
    let numSpaces = matchCount(undefined, /\s/g);
    console.log(numSpaces); // 0
    numSpaces = matchCount("foobarbaz", /\s/g);
    console.log(numSpaces); // 0
    numSpaces = matchCount("foo bar baz", /\s/g);
    console.log(numSpaces); // 2

?. is the optional chaining operator. It allows you to chain calls as deep as you want without having to worry about whether there is an undefined/null along the way. Think of str?.match(re) as

if (str !== undefined && str !== null) {
    return str.match(re);
} else {
    return undefined;
}

This is slightly different from @Paolo Bergantino's. Theirs is written like this: (str || ''). That means if str is falsy, '' is used instead. 0 is falsy. document.all is falsy. In my opinion, if someone were to pass those into this function as a string, it would probably be because of programmer error. Therefore, I'd rather be informed that I'm doing something nonsensical than troubleshoot why I keep getting a length of 0.
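
As a rough illustration of that difference, the sketch below passes the number 0 to both styles. The names `withOr` and `withOptionalChaining` are made up for this comparison.

```javascript
const whitespace = /\s/g;

// Falsy-coalescing style: any falsy input is silently replaced by ''.
const withOr = (str) => ((str || '').match(whitespace) || []).length;

// Optional-chaining style: only null/undefined short-circuit.
const withOptionalChaining = (str) => str?.match(whitespace)?.length ?? 0;

console.log(withOr(0)); // 0 (the mistake goes unnoticed)

let threwTypeError = false;
try {
  withOptionalChaining(0); // numbers have no match() method
} catch (err) {
  threwTypeError = err instanceof TypeError;
}
console.log(threwTypeError); // true (the mistake is reported)
```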

?? is the nullish coalescing operator. Think of it as || but more specific. || evaluates its right-hand side whenever the left-hand side is falsy, whereas ?? evaluates its right-hand side only when the left-hand side is undefined or null.

Keep in mind that ?.length ?? 0 returns the same thing as ?.length || 0. The only difference is that when length is 0, ?? doesn't evaluate the right-hand side while || does... but the result is going to be 0 whether you use || or ??.
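
A short sketch of where the two operators agree and where they differ. The fallback value -1 is an arbitrary choice made up for this example, to show a case in which the results diverge.

```javascript
// No match: match() returns null, so ?.length yields undefined.
const noMatchLength = 'foobarbaz'.match(/\s/g)?.length;
console.log(noMatchLength ?? 0, noMatchLength || 0); // 0 0

// A length that really is 0: same result with a fallback of 0.
const zeroLength = [].length;
console.log(zeroLength ?? 0, zeroLength || 0); // 0 0

// With a non-zero fallback the operators differ:
// ?? keeps the 0, while || replaces it because 0 is falsy.
const keptByNullish = zeroLength ?? -1;
const replacedByOr = zeroLength || -1;
console.log(keptByNullish, replacedByOr); // 0 -1
```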

Honestly, in this situation I would probably change it to || because more JavaScript developers are familiar with that operator. Maybe someone could enlighten me on benefits of ?? vs || in this situation, if any exist.

Lastly, I changed the signature so the function can be used for any regex.

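Oh, and here is a TypeScript version:

    // str may be null or undefined, which is what the optional chaining guards against
    const matchCount = (str: string | null | undefined, re: RegExp): number => {
      return str?.match(re)?.length ?? 0;
    };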