JavaScript regular expression literal persists between function calls

I have this piece of code:

function func1(text) {

    var pattern = /([\s\S]*?)(\<\?(?:attrib |if |else-if |else|end-if|search |for |end-for)[\s\S]*?\?\>)/g;

    var result;
    while (result = pattern.exec(text)) {
        if (some condition) {
            throw new Error('failed');
        }
        ...
    }
}

This works, unless the throw statement is executed. In that case, the next time I call the function, the exec() call starts where it left off, even though I am supplying it with a new value of 'text'.

I can fix it by writing

var pattern = new RegExp('.....');

instead, but I don't understand why the first version is failing. How is the regular expression persisting between function calls? (This is happening in the latest versions of Firefox and Chrome.)

Edit: Complete test case:

<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
<title>Test Page</title>
<style type='text/css'>
body {
    font-family: sans-serif;
}
#log p {
    margin:     0;
    padding:    0;
}
</style>
<script type='text/javascript'>
function func1(text, count) {

    var pattern = /(one|two|three|four|five|six|seven|eight)/g;

    log("func1");
    var result;
    while (result = pattern.exec(text)) {
        log("result[0] = " + result[0] + ", pattern.lastIndex = " + pattern.lastIndex);
        if (--count <= 0) {
            throw "Error";
        }
    }
}

function go() {
    try { func1("one two three four five six seven eight", 3); } catch (e) { }
    try { func1("one two three four five six seven eight", 2); } catch (e) { }
    try { func1("one two three four five six seven eight", 99); } catch (e) { }
    try { func1("one two three four five six seven eight", 2); } catch (e) { }
}

function log(msg) {
    var log = document.getElementById('log');
    var p = document.createElement('p');
    p.innerHTML = msg;
    log.appendChild(p);
}

</script>
</head>
<body><div>
<input type='button' id='btnGo' value='Go' onclick='go();'>
<hr>
<div id='log'></div>
</div></body>
</html>

On the second call, the regular expression continues with 'four' in Firefox and Chrome, but not in IE7 or Opera.

Charles Anderson asked Apr 15 '10 12:04


1 Answer

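RegExp objects created from a regex literal are cached, whereas new RegExp always creates a new object. Under ECMAScript 3, a literal is converted to a RegExp object once, when the script is scanned, and every evaluation of the literal returns that same object. The cached object keeps its state, including the lastIndex property that exec() advances on a global (/g) pattern. Because the throw leaves the loop before exec() returns null, lastIndex is never reset to 0, so the next call resumes from the old position even when it is given a new string. ECMAScript 5 changed the rule so that each evaluation of a literal creates a fresh object, which may explain why browsers differ here. Steve Levithan talks about that in this blog post (near the bottom).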
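
The saved state is the lastIndex property of a global regex. Here is a minimal sketch of the effect, assuming a single RegExp object shared between calls (hoisted to outer scope here to stand in for the cached literal, since current engines create a new object each time a literal is evaluated). The names sharedPattern and takeMatches are hypothetical, and break stands in for the throw in the question:

```javascript
// One RegExp object shared across calls, standing in for the cached literal.
var sharedPattern = /one|two|three|four/g;

// Collects up to `limit` matches, then leaves the loop early.
// An early exit leaves sharedPattern.lastIndex pointing into the old scan.
function takeMatches(text, limit, reset) {
    if (reset) {
        sharedPattern.lastIndex = 0; // force the scan to start at the beginning
    }
    var found = [];
    var result;
    while ((result = sharedPattern.exec(text)) !== null) {
        found.push(result[0]);
        if (found.length >= limit) {
            break; // early exit, like the throw in the question
        }
    }
    return found;
}

var text = "one two three four";
var first = takeMatches(text, 2, false); // ["one", "two"]; lastIndex is left at 7
var stale = takeMatches(text, 2, false); // ["three", "four"]; resumes at index 7
var fresh = takeMatches(text, 2, true);  // ["one", "two"]; lastIndex was reset
```

Resetting lastIndex to 0 before the loop (or building the pattern inside the function with new RegExp, as in the question) should avoid the carry-over regardless of how the engine treats literals.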

Alan Moore answered Sep 22 '22 01:09