
Efficiently find which group matched in a RegExp search

When my RegExp has a number of capturing groups, I want to know which group made the capture (or at least the first/last such group, if there were more than one). If you're familiar with Python, this is basically the equivalent of re.MatchObject.lastgroup. Some code to make it clearer:

var re_captures = new RegExp("(\\d+)|(for)|(\\w+)", "g");
var str = " for me 20 boxes please";
var result;

while ((result = re_captures.exec(str)) !== null) {
  console.log(result[0], 'at', result.index, result.slice(1));
}

It prints:

for at 1 [ undefined, 'for', undefined ]
me at 5 [ undefined, undefined, 'me' ]
20 at 8 [ '20', undefined, undefined ]
boxes at 11 [ undefined, undefined, 'boxes' ]
please at 17 [ undefined, undefined, 'please' ]

The result array shows which groups made a capture, but I see no way to quickly find out, for each match, which group matched without iterating through the array. This would be useful when large regexes are built programmatically and iterating over every group is inefficient.

Am I missing something obvious, or isn't it possible?

Eli Bendersky asked Jun 17 '13 14:06


1 Answer

You’re not missing anything; iterating through the array is the only way.
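
For reference, here is a minimal sketch of that iteration, using the regex and input string from the question. The helper name firstMatchedGroup is hypothetical (it is not part of the RegExp API); it simply scans the result array and returns the 1-based number of the first group that is not undefined.

```javascript
// Hypothetical helper (not a built-in): returns the 1-based number of the
// first capturing group that participated in the match, or 0 if none did.
function firstMatchedGroup(result) {
  for (let i = 1; i < result.length; i++) {
    // Groups that did not participate are undefined; an empty-string
    // capture still counts as a match, so compare against undefined.
    if (result[i] !== undefined) {
      return i;
    }
  }
  return 0;
}

const re = /(\d+)|(for)|(\w+)/g;
const str = " for me 20 boxes please";
const groups = [];
let m;

// Collect [matched text, group number] pairs for every match.
while ((m = re.exec(str)) !== null) {
  groups.push([m[0], firstMatchedGroup(m)]);
}

console.log(groups);
```

The scan is linear in the number of groups and stops at the first hit; scanning from the end of the array instead would give the last matching group, closer to Python's lastgroup.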

How many groups could there be that iterating through the matches is actually a performance problem? If you don’t need a group, you can always make it non-capturing, but…

Ry- answered Nov 15 '22 00:11