 

Prevent tainting properties of the RegExp constructor in JavaScript

This is a bit of a conundrum. I have an idea of how I might be able to fix it, but I'm wondering if there's a (much) easier way.

In short, whenever a regular expression is executed in JavaScript, certain properties are assigned values on the RegExp constructor. For instance:

/foo/.test('football')
//-> true

RegExp.input
//-> "football"

RegExp.rightContext
//-> "tball"

I'd like to execute a regular expression without affecting these properties. If that's not possible (and I don't think it is), I'd like to at least restore them to their previous values afterwards.

I know input/$_ is writeable, but most of the others aren't, it seems. One option might be to reconstruct a regular expression that would reapply all these values, but I think that would be quite difficult.
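To make the writability point concrete, here is a minimal sketch of what I mean. The variable names are only for illustration, and the exact behaviour of the failed assignment (silently ignored or a thrown TypeError) may differ between engines and between sloppy and strict mode, hence the try/catch:

```javascript
// Populate the legacy static properties with a known match.
/foo/.test('football');

// `input` (alias `$_`) accepts assignment.
RegExp.input = 'something else';
var inputAfterWrite = RegExp.input;

// `lastMatch` cannot be overwritten: depending on the engine and on strict
// mode, the assignment is either ignored or throws a TypeError.
try {
    RegExp.lastMatch = 'bar';
} catch (e) {
    // Ignored on purpose; only the resulting value matters here.
}
var lastMatchAfterWrite = RegExp.lastMatch;
```

After this runs, inputAfterWrite holds the new string, while lastMatchAfterWrite still holds the text matched by /foo/.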

I want this because I'm writing a shim of a native API and testing it with the test262 suite. Some test262 tests fail because they check whether the RegExp object has unexpected values for these properties.

Andy E asked May 23 '13 11:05



2 Answers

You can try to create a wrapper function for test:

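// `test` lives on RegExp.prototype, not on the RegExp constructor itself
var fTest = RegExp.prototype.test;
RegExp.prototype.test = function () {
    // Call the original method on the regular expression being tested
    var bReturn = fTest.apply(this, arguments);
    // Try to clear the tainted properties; whether `delete` has any
    // effect on these legacy properties depends on the engine
    delete RegExp.input;
    delete RegExp.rightContext;
    return bReturn;
};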
Lex answered Oct 08 '22 16:10



This is the final result. It's a little more robust than my initial effort; it properly escapes sub-expressions, makes sure they appear in the right order and doesn't stop when it finds an empty one:

/**
 * Constructs a regular expression to restore tainted RegExp properties
 */
function createRegExpRestore () {
    var lm  = RegExp.lastMatch,
        ret = {
           input: RegExp.input
        },
        esc = /[.?*+^$[\]\\(){}|-]/g,
        reg = [],
        cap = {};

    // Create a snapshot of all the 'captured' properties
    for (var i = 1; i <= 9; i++)
        cap['$'+i] = RegExp['$'+i];

    // Escape any special characters in the lastMatch string
    // ('$&' inserts the whole match in a replacement string)
    lm = lm.replace(esc, '\\$&');

    // Now, iterate over the captured snapshot
    for (var i = 1; i <= 9; i++) {
        var m = cap['$'+i];

        // If it's empty, add an empty capturing group
        if (!m)
            lm = '()' + lm;

        // Else find the escaped string in lm and wrap it to capture it
        else
            lm = lm.replace(m.replace(esc, '\\$&'), '($&)');

        // Push to `reg` and chop `lm`
        reg.push(lm.slice(0, lm.indexOf('(') + 1));
        lm = lm.slice(lm.indexOf('(') + 1);
    }

    // Create the property-reconstructing regular expression
    ret.exp = RegExp(reg.join('') + lm, RegExp.multiline ? 'm' : '');

    return ret;
}

It does what I originally thought to be difficult. This should restore all the properties to their former values, if you use it like so:

var 
    // Create a 'restore point' for RegExp
    old  = createRegExpRestore(),

    // Run your own regular expression
    test = someOtherRegEx.test(someValue);

// Restore the previous values by running the RegExp
old.exp.test(old.input);
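The snapshot-and-replay idea can also be shown end to end in a reduced form. The sketch below is not the function above: it drops the capture-group handling and only rebuilds input, lastMatch and the context properties. The names snapshotLastMatch and runUntainted are hypothetical helpers made up for this example. It also assumes the last match is the first occurrence of that text in the input; if the text appears earlier in the string, leftContext and rightContext would come back different.

```javascript
// Hypothetical helper: snapshot the state that can be rebuilt without
// capture groups (input, lastMatch and the two context properties).
function snapshotLastMatch() {
    var special = /[.?*+^$[\]\\(){}|-]/g;
    var escaped = (RegExp.lastMatch || '').replace(special, '\\$&');
    return { input: RegExp.input, exp: new RegExp(escaped) };
}

// Hypothetical wrapper: run `fn`, then replay the snapshot so the legacy
// properties look as they did before, even if `fn` throws.
function runUntainted(fn) {
    var snap = snapshotLastMatch();
    try {
        return fn();
    } finally {
        snap.exp.test(snap.input);
    }
}

// Usage: the outer match survives an unrelated inner match.
/ball/.test('football');
var innerResult = runUntainted(function () {
    return /x(y)/.test('xyz');
});
```

Wrapping the replay in finally keeps the restore step from being skipped when the wrapped code throws, which seems useful in a shim that has several exit paths.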
Andy E answered Oct 08 '22 17:10
