
Bug in JavaScript V8 regex engine when matching beginning-of-line?

I have a pretty nifty tool, underscore-cli, that's getting the strangest behavior when printing out the help / usage information.

In the usage() function, I do this to indent blocks of text (eg, the options):

str.replace(/^/gm, "    ");

This regex, in addition to being pretty obvious, comes straight out of TJ Holowaychuk's commander.js code. The regex is correct.

Yet I get bizarre spaces inserted into the middle of my usage text, like this:

  Commands:
...
     values              Retrieve all the values of an object's properties.
     extend <object>     Override properties in the input data.
     defaults <object>   Fill in missing properties in the input data.
     any <exp>           Return 'true' if any of the values in the input make the expression true.  Expression args: (value, key, list)
         all <exp>           Return 'true' if all values in the input make the expression true.  Expression args: (value, key, list)
     isObject            Return 'true' if the input data is an object with named properties
     isArray             Return 'true' if the input data is an array
     isString            Return 'true' if the input data is a string
...

99% chance, this HAS to be a bug in V8.

Anyone know why this happens, or what the easiest work-around would be?

Yup, turns out this IS a V8 bug, 1748 to be exact. Here's the workaround I used in the tool:

str.replace(/(^|\n)/g, "$1    ");
Dave Dopson asked Apr 14 '12 01:04


2 Answers

This is a bug in V8 (bug 1748):

http://code.google.com/p/v8/source/browse/branches/bleeding_edge/test/mjsunit/regress/regress-1748.js?spec=svn9504&r=9504

Here is a test for the bug:

function assertEquals(a, b, msg) { if (a !== b) { console.log("'%s' != '%s'.  %s", a, b, msg); } }

var str = Array(10000).join("X");
str.replace(/^|X/g, function(m, i, s) {
  if (i > 0) assertEquals("X", m, "at position 0x" + i.toString(16));
});

On my box, it prints:

'X' != ''.  at position 0x100
'X' != ''.  at position 0x200
'X' != ''.  at position 0x300
'X' != ''.  at position 0x400
'X' != ''.  at position 0x500
'X' != ''.  at position 0x600
...

On jsfiddle, it prints nothing (the version of V8 in my Chrome browser doesn't have the bug):

http://jsfiddle.net/PqDHk/
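
If you'd rather get the offending offsets back as data than read them off the console, here is a minimal sketch of the same probe. The helper name `findSpuriousAnchorMatches` is made up for this example and is not part of the V8 regression test. It assumes that on a correct engine `/^|X/g` yields one empty match at offset 0 and plain "X" matches everywhere else.

```javascript
// Hypothetical helper (not from the V8 test suite): returns the offsets
// at which /^|X/g matched something other than "X" past the start of
// the string. On an engine without the bug the result should be empty.
function findSpuriousAnchorMatches(length) {
  var str = new Array(length + 1).join("X"); // `length` copies of "X"
  var offsets = [];
  str.replace(/^|X/g, function (match, offset) {
    if (offset > 0 && match !== "X") {
      offsets.push(offset);
    }
    return match; // leave the string unchanged
  });
  return offsets;
}

var bad = findSpuriousAnchorMatches(9999);
console.log(bad.length === 0
  ? "no spurious matches"
  : "spurious matches at offsets: " + bad.join(", "));
```

An empty array means the engine you ran it on does not show the problem for this input; it says nothing about other patterns.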


Bug History:

From the V8 changelog, the bug was fixed in V8-3.6.5 (2011-10-05).

From the Node.js changelog, Node-0.6.5 should be using V8-3.6.6.11. Node.js updated from V8-3.6.4 to V8-3.7.0 (Node-0.5.10) and then downgraded to V8-3.6.6 for Node-0.6.0. So in theory, this bug should have been fixed before Node-0.6.0. Why does it still reproduce on Node-0.6.5? Odd.

Can someone with the latest Node (0.6.15) run the test snippet above and report whether it prints any errors? Otherwise I'll get around to it eventually.

Thanks to ZachB for confirming this bug on Node-0.6.15. I filed an issue (issue #3168) against node, and a fix (5d69bbf) has been applied and should be included in Node-0.6.16. :) :) :)

Until then, the workaround is to replace:

str.replace(/^/gm, indent);

With:

str.replace(/(^|\n)/g, "$1" + indent);
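
For reference, here is a rough sketch of that workaround wrapped in a function. The name `indentBlock` is hypothetical (it is not what underscore-cli or commander.js call it). It uses a function replacer in place of the `"$1"` string so that an `indent` containing `$` is not interpreted as a replacement pattern.

```javascript
// Hypothetical helper; the name indentBlock is illustrative only.
function indentBlock(str, indent) {
  // Match the start of the string or any newline, then re-emit the
  // captured text ("" or "\n") followed by the indent. No multiline
  // flag is used, so ^ is only meant to match at offset 0.
  return str.replace(/(^|\n)/g, function (match, lead) {
    return lead + indent;
  });
}

console.log(indentBlock("Commands:\n  values\n  extend <object>", "    "));
```

One thing to watch: if the text ends with a newline, this also appends the indent after that final newline, so you may want to trim trailing whitespace afterwards.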

UPDATE: Just for giggles, I checked this on the current node version, v0.8.1, and confirmed that the bug is indeed fixed. I haven't bothered to go back and confirm if the bug was fixed in 0.6.16 or sometime between there and the v0.8.X series.

Dave Dopson answered Oct 15 '22 05:10


Workaround: capture the first character of each line and replace it with the indent followed by itself:

str.replace(/^(.)/gm, "    $1");

or, to skip lines that are already indented:

str.replace(/^([^\s])/gm, "    $1");
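
A small sketch of both variants as functions, in case it helps. The names `indentNonEmpty` and `indentUnindented` are made up for this example. Note that `$1` only refers to something when the pattern contains a capture group, so the parentheses matter. I have not checked whether these patterns sidestep the bug on an affected V8 build.

```javascript
// Hypothetical helpers; names are illustrative only.
// Both assume `indent` is plain whitespace (no "$" characters).

// Indent every non-empty line: "." never matches a newline, so blank
// lines are left untouched.
function indentNonEmpty(str, indent) {
  return str.replace(/^(.)/gm, indent + "$1");
}

// Indent only lines whose first character is not whitespace, i.e.
// lines that are not already indented.
function indentUnindented(str, indent) {
  return str.replace(/^(\S)/gm, indent + "$1");
}

console.log(indentNonEmpty("first\n\nsecond", "    "));
console.log(indentUnindented("top\n  nested\nbottom", "    "));
```

Unlike the `(^|\n)` form, both of these skip empty lines, which may or may not be what you want for help text.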
Umbrella answered Oct 15 '22 05:10