
Regex exec only returning first match [duplicate]

RegExp.exec is only able to return a single match result at once.

In order to retrieve multiple matches you need to run exec on the expression object multiple times. For example, using a simple while loop:

var ptrn = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;

var match;
while ((match = ptrn.exec(input)) != null) {
    console.log(match);
}

This will log all matches to the console.

Note that for this to work, the regular expression must have the g (global) flag. With this flag set, exec (and test) update the expression's lastIndex property after each match, so the next call starts searching right after the previous result.
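To make the role of the g flag concrete, here is a small sketch comparing the same pattern with and without it. The sample string and variable names are made up for illustration and are not from the question.

```javascript
// Hypothetical sample input, not taken from the original question.
const sample = "a1 b2 c3";

// With the g flag, each exec call resumes at lastIndex.
const globalRe = /[a-z]\d/g;
const first = globalRe.exec(sample);        // matches "a1"
const indexAfterFirst = globalRe.lastIndex; // position right after "a1"
const second = globalRe.exec(sample);       // matches "b2"

// Without the g flag, lastIndex stays at 0 and exec keeps returning
// the first match, so a while loop over exec would never terminate.
const plainRe = /[a-z]\d/;
const again1 = plainRe.exec(sample);
const again2 = plainRe.exec(sample);

console.log(first[0], indexAfterFirst, second[0]);    // a1 2 b2
console.log(again1[0], again2[0], plainRe.lastIndex); // a1 a1 0
```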


You can also call the match method on the string to retrieve the whole collection of matches at once:

var ptrn = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;
var results = "hello world".match(ptrn);

With this regular expression, results is:

["hello", " ", "world"]

match spec is here
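One thing to be aware of: with the g flag, match returns only the matched strings and drops capture groups. If you need the groups, String.prototype.matchAll (ES2020, so this assumes a reasonably modern runtime) yields full match objects. A rough sketch, using a made-up pattern and input:

```javascript
// Hypothetical pattern and input, chosen only to show capture groups.
const keyValue = /(\w+)=(\d+)/g;
const config = "width=10 height=20";

// match with the g flag: whole matches only, no groups.
const whole = config.match(keyValue);

// matchAll: an iterator of full match arrays, groups included.
// It throws a TypeError if the regex lacks the g flag.
const pairs = [...config.matchAll(keyValue)].map(m => [m[1], Number(m[2])]);

console.log(whole); // ["width=10", "height=20"]
console.log(pairs); // [["width", 10], ["height", 20]]
```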


I am not sure what "hello" "world" refers to in your question (is it user input or a regex?), but as I understand it a RegExp object is stateful: its lastIndex property is the position the next search starts from. exec does not return all the results at once. It returns only the first match, and you need to call .exec again to get the remaining results, each search resuming from lastIndex:

const re1 = /^\s*(\w+)/mg; // find the first word of every line
const text1 = "capture discard\n me but_not_me"; // two lines of text
for (let match; (match = re1.exec(text1)) !== null;)
    console.log(match, "next search at", re1.lastIndex);

prints

["capture", "capture"] "next search at" 7
[" me", "me"] "next search at" 19
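Because lastIndex lives on the regex object and not on the string, the state also carries over if you reuse the same global regex on a different input. The sketch below (names and strings are invented for illustration) shows the surprise and the usual fix of resetting lastIndex:

```javascript
// Hypothetical example of state leaking between two inputs.
const wordRe = /\w+/g;

const a = wordRe.exec("alpha beta"); // "alpha", lastIndex is now 5
const leaked = wordRe.exec("one two"); // search starts at index 5 of the new string

// Resetting lastIndex makes the next search start from the beginning again.
wordRe.lastIndex = 0;
const fresh = wordRe.exec("one two");

console.log(a[0], leaked[0], fresh[0]); // alpha wo one
```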

With ES6 you can wrap this in a generator to get an iterator over the results:

// Lazily yields each match; the regex must have the g flag.
RegExp.prototype.execAllGen = function* (input) {
    for (let match; (match = this.exec(input)) !== null;)
        yield match;
};

RegExp.prototype.execAll = function (input) {
    return [...this.execAllGen(input)];
};

Note also that, unlike in poke's answer, the match variable here is scoped to the for loop and does not leak into the surrounding code.

Now, you can capture your matches easily, in one line

const log = console.log; // shorthand used below
const matches = re1.execAll(text1);

log("captured strings:", matches.map(m => m[1]));
log(matches.map(m => [m[1], m.index]));
for (const match of matches) log(match[1], "found at", match.index);

which prints

"captured strings:" ["capture", "me"]

[["capture", 0], ["me", 16]]
"capture" "found at" 0
"me" "found at" 16
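If you would rather not extend RegExp.prototype, the same idea can be written as a standalone generator. This is only a sketch: the function name execAll and its guards are my own additions, not part of any built-in API. It works on a copy of the regex so the caller's lastIndex is left alone, and it steps past empty matches, which would otherwise loop forever.

```javascript
// Hypothetical helper: iterate over all matches without touching the prototype.
function* execAll(re, input) {
  if (!re.global) throw new TypeError("execAll needs a regex with the g flag");
  // Work on a copy so the caller's lastIndex is not modified.
  const copy = new RegExp(re.source, re.flags);
  for (let match; (match = copy.exec(input)) !== null;) {
    // An empty match does not advance lastIndex, so step forward manually.
    if (match[0] === "") copy.lastIndex++;
    yield match;
  }
}

const firstWords = /^\s*(\w+)/mg;
const lines = "capture discard\n me but_not_me";
const found = [...execAll(firstWords, lines)].map(m => [m[1], m.index]);

console.log(found); // [["capture", 0], ["me", 16]]
```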