
Why does the same RegExp behave differently? [duplicate]

Possible Duplicate:
Interesting test of Javascript RegExp
Regular expression test can't decide between true and false (JavaScript)

Example of the issue: when the RegExp is written inline, the results are what I would expect, but when it is stored in a variable, it skips the middle span element.

// Inline RegExp
function getToggleClasses() {
  var toggler = [],
      elements = document.getElementsByTagName("*"),
      i=0,
      len = elements.length;

  for (i; i < len; i++) {
    if (/toggler/g.test(elements[i].className)) {
      toggler.push(elements[i]);
    }
  }

  document.getElementById('results').innerHTML += "<br />Inline: " + toggler.length;
}

// Variable
function getToggleClasses2() {
  var toggler = [],
      elements = document.getElementsByTagName("*"),
      tester = /toggler/g,
      i=0,
      len = elements.length;

  for (i; i < len; i++) {
    if (tester.test(elements[i].className)) {
      toggler.push(elements[i]);
    }
  }

  document.getElementById('results').innerHTML += "<br />Variable: " + toggler.length;
}

Mark up:

<span class="toggler">A</span>
<span class="toggler">B</span>
<span class="toggler">C</span>

Given: I understand there is no reason to use a RegExp to do this comparison and I also understand how great libraries such as jQuery are. I also know that the g is not needed in this case.

I can't understand why these two methods should ever return different results.

Joe asked Jun 13 '12 19:06


3 Answers

RegExp instances are stateful, so reusing them can cause unexpected behavior. In this particular case, it's because the instance is global, meaning:

that the regular expression should be tested against all possible matches in a string.

That's not the only difference caused by using g, however. From RegExp.test @ MDN:

As with exec (or in combination with it), test called multiple times on the same global regular expression instance will advance past the previous match.


Remove the g flag, or set lastIndex to 0 (thanks, @zzzzBov).
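Both fixes can be tried without a DOM. The following is a minimal sketch, assuming the three className values are stood in for by a plain array (the `classNames` array and the variable names are hypothetical, not part of the question):

```javascript
// Hypothetical stand-ins for the className values of the three spans.
const classNames = ["toggler", "toggler", "toggler"];

// Fix 1: drop the g flag, so test() always starts searching at index 0.
const plainRe = /toggler/;
const withoutFlag = classNames.filter((name) => plainRe.test(name));

// Fix 2: keep the g flag, but rewind lastIndex before every test.
const globalRe = /toggler/g;
const withReset = classNames.filter((name) => {
  globalRe.lastIndex = 0;
  return globalRe.test(name);
});

// For comparison: one shared global regex with no reset, as in the question.
const sharedRe = /toggler/g;
const buggy = classNames.filter((name) => sharedRe.test(name));

console.log(withoutFlag.length, withReset.length, buggy.length); // 3 3 2
```

Dropping the flag is the simpler of the two here, since test() only needs to know whether a match exists at all.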

Matt Ball answered Oct 17 '22 11:10


/g is not needed and should not be used in this case.

The behavior differs because in the "inline" case the regex object is re-created on each iteration of the loop, whereas the regex stored in the variable is created once and keeps its state (lastIndex) between loop iterations.

Move the var into the loop and you will get the same result:

// Variable, now declared inside the loop so each iteration gets a fresh regex
function getToggleClasses2() {
  var toggler = [],
      elements = document.getElementsByTagName("*"),
      i=0,
      len = elements.length;

  for (i; i < len; i++) {
    var tester = /toggler/g;
    if (tester.test(elements[i].className)) {
      toggler.push(elements[i]);
    }
  }

  document.getElementById('results').innerHTML += "<br />Variable: " + toggler.length;
}
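
The claim that a regex literal yields a new object each time it is evaluated can be checked directly. This is a small sketch, assuming an ES5-or-later engine; `makeTester` is a hypothetical helper introduced only for illustration:

```javascript
// Hypothetical helper: returns whatever object the literal evaluates to.
function makeTester() {
  return /toggler/g;
}

const first = makeTester();
const second = makeTester();

// A successful global match advances lastIndex on that instance only.
first.test("toggler");

console.log(first === second);                  // false
console.log(first.lastIndex, second.lastIndex); // 7 0
```

Because each evaluation produces a separate instance, state left behind on one regex never leaks into the next iteration.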

Qtax answered Oct 17 '22 12:10


The regex maintains a variable called lastIndex, which is the index to start the next search. From MDN:

As with exec (or in combination with it), test called multiple times on the same global regular expression instance will advance past the previous match.

When you define an inline regex for each iteration, the state is lost and lastIndex is always 0 because you have a fresh regex each time. If you keep the regex in a variable, lastIndex is saved as the ending position of the last match, which in this case causes the next search to begin at the end of the next string, resulting in a failed match. By the time the third comparison comes around, lastIndex has been reset to 0, because a failed match resets it.
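
The sequence can be traced by recording lastIndex before and after each call. This is a minimal sketch, assuming the three class names are represented by plain strings; the `trace` array is a hypothetical name used only for illustration:

```javascript
const tester = /toggler/g;
const trace = [];

// Three identical strings stand in for the three span class names.
for (const name of ["toggler", "toggler", "toggler"]) {
  const before = tester.lastIndex;    // where this search will start
  const matched = tester.test(name);  // may advance or reset lastIndex
  trace.push({ before, matched, after: tester.lastIndex });
}

console.log(trace);
// [ { before: 0, matched: true,  after: 7 },
//   { before: 7, matched: false, after: 0 },
//   { before: 0, matched: true,  after: 7 } ]
```

The second call starts at index 7, which is the length of "toggler", so there is nothing left to match and the middle element is skipped.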

apsillers answered Oct 17 '22 10:10