Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Firefox bad RegEx performance

I use the JavaScript parser generator JISON to create a parser for some scripts that my users create. Lately I've noticed that the parsing process on Firefox is by a large factor slower than on any other Browser my page supports (IE10, latest Chrome & Opera).

After digging a little into the source of the generated parser I've narrowed the problem down to one line of code which executes some regex to tokenize the code to parse. Of course this line is executed pretty often.

I've created a little test case with some random string (~ 1300 characters long) and a pretty generic regex. This test case measures the average time it takes to execute the regex 10000 times (Working example on JSFiddle):

$(document).ready(function() {
    var str = 'asdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj',
        regex = new RegExp('^([0-9])+'),
        durations = [],
        resHtml = 'Durations:',
        totalDuration = 0,
        matches, start;

    // Perform "timing test" 10 times to get some average duration
    for (var i = 0; i < 10; i++) {
        // Execute regex 10000 times and see how long it takes
        start = window.performance.now();
        for (var j = 0; j < 10000; j++) {
            regex.exec(str);
        }
        durations.push(window.performance.now() - start);
    }

    // Create output string and update DIV
    for (var i = 0; i < durations.length; i++) {
        totalDuration += durations[i];
        resHtml += '<br>' + i + ': ' + (parseInt(durations[i] * 100, 10) / 100) + ' ms';
    }
    resHtml += '<br>==========';
    resHtml += '<br>Avg: ' + (parseInt((totalDuration / durations.length) * 100, 10) / 100) + ' ms';

    $('#result').html(resHtml);
});

Following are the test results on my machine:

Firefox 24: Average time is between 370 & 450 ms for 10000 regex executions
Chrome 30, Opera 17, IE 10: Average time is between 0.3 & 0.6 ms

This difference gets even larger if the string to test gets bigger. A 6000 character long string increases the average time in Firefox to ~ 1.5 seconds (!) while the other browsers still need ~ 0.5 milliseconds (!) (Working example on JSFiddle with 6000 characters).

Why is there such a big performance difference between Firefox and all other browsers and can I improve it anyhow?

Please note that I can't adjust the executed regexes themeselves because they are mostly generated by the parser generator and I don't want to manually change the built parser code.

like image 708
suamikim Avatar asked Oct 30 '13 10:10

suamikim


1 Answers

It is the RegExp capturing grouping that got you:

/^[0-9]+/ and/or /^(?:[0-9])+/ and/or /^([0-9]+)/ are orders of magnitude faster than /^([0-9])+/. And they should be viable alternatives.

I would expect it to be slightly slower with capturing groups, but that it is that much slower surprises me. However the slow version has the potential to create lots and lots of captures, while the other versions do not, so that seems to be an important difference.

Unscientific jsperf.

You may want to file a bug.

like image 169
nmaier Avatar answered Sep 29 '22 01:09

nmaier