
Regular Expression Quantifiers

I thought this would be a "no-brainer" addition to my regEx, but of course I was proven wrong...

My current regEx returns true if the string is a single symbol (-, $, +, (, ), {, }):

(/^[-$+)(}{]$/).test(token);

I want to add two symbols to the regEx: the assignment operator (=) and the equality operator (==). My intuition guided me to something along these lines, which should return true for a token made of one or two '=':

(/^[-$+)(}{]|(=){1,2}$/).test(token);

Yet when the actual token is "===", the test still returns true: (/^[-$+)(}{]|(=){1,2}$/).test("===")

Can someone shed some light on my regEx shortcomings?

Thanks

Joey asked Dec 17 '25 19:12


2 Answers

You've run into a subtle operator precedence problem.

/^[-$+)(}{]|(=){1,2}$/

^ and $ bind more tightly than |, so this is equivalent to

/(?:^[\-$+)(}{])|(?:={1,2}$)/

instead of what you probably want which is to have ^...$ enclose the |:

/^(?:[\-$+)(}{]|={1,2})$/

or simplified

/^(?:[\-$+(){}]|==?)$/

/^(?:[\-$+(){}]|==?)$/.test("===") === false;
/^(?:[\-$+(){}]|==?)$/.test("()")  === false;
/^(?:[\-$+(){}]|==?)$/.test("=")   === true;
/^(?:[\-$+(){}]|==?)$/.test("==")  === true;
/^(?:[\-$+(){}]|==?)$/.test("(")   === true;
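To see where the original pattern goes wrong on "===", it can help to look at what exec actually matched. The sketch below uses only the two patterns from this thread; the variable names (original, fixed, m) are just for illustration.

```javascript
// The pattern from the question: ^ belongs only to the first branch
// and $ belongs only to the second branch.
const original = /^[-$+)(}{]|(=){1,2}$/;
// The corrected pattern: both anchors wrap the whole alternation.
const fixed = /^(?:[-$+(){}]|==?)$/;

// The second branch, (=){1,2}$, is free to start anywhere, so it
// matches the last two characters of "===".
const m = original.exec("===");
console.log(m[0], m.index); // == 1

// The mirror-image symptom: the first branch has no $, so anything
// that merely starts with a symbol passes too.
console.log(original.test("(abc")); // true

// With the anchors around the group, both inputs are rejected.
console.log(fixed.test("==="));  // false
console.log(fixed.test("(abc")); // false
```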

Digression on capturing vs non-capturing parentheses

I prefer (?:...) to (...) unless I actually want to capture content because, although (?:...) is more verbose, it has fewer subtle effects on code that uses the regular expression.

Some problems with using capturing groups when you don't intend to capture content include:

  1. changing the numbering of existing groups, which matters whenever code refers to groups by number (JS only gained named groups in ES2018),
  2. confusing a maintainer into thinking I'm capturing content for a reason, because I used an operator with more effects than I need,
  3. changing the behavior of a distant exec loop (this is mostly a problem with Perl-style global matches like @foo = $str =~ /(foo(bar))/g, but every once in a while you'll see JS code doing something similar),
  4. changing the behavior of a (possibly variadic) replacer function defined elsewhere like
    newStr = oldStr.replace(
        regexDefinedEarlier,
        function (var_args) {
          return [].slice.call(arguments, 1, arguments.length - 2).join('');
        });
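Here is a small sketch of points 1 and 4. The two date-range-like patterns and the name joinGroups are made up for illustration; the replacer body is the variadic one from the list above.

```javascript
// Variadic replacer: joins every captured group, skipping the full match
// (first argument) and the trailing offset and input string.
function joinGroups(var_args) {
  return [].slice.call(arguments, 1, arguments.length - 2).join('');
}

// Hypothetical patterns that differ only in whether the separator
// is wrapped in a capturing or a non-capturing group.
const nonCapturing = /(\d+)(?:-|\/)(\d+)/;
const capturing = /(\d+)(-|\/)(\d+)/;

// Point 4: the extra capturing group changes the replacer's output.
const withoutGroup = "10-20".replace(nonCapturing, joinGroups);
const withGroup = "10-20".replace(capturing, joinGroups);
console.log(withoutGroup); // 1020
console.log(withGroup);    // 10-20

// Point 1: the extra group also shifts the numbering, so index 2
// no longer refers to the second number.
console.log(nonCapturing.exec("10-20")[2]); // 20
console.log(capturing.exec("10-20")[2]);    // -
```

Neither pattern uses named groups, so the replacer receives no trailing groups object and the arguments.length - 2 cutoff holds.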
    
Mike Samuel answered Dec 20 '25 07:12


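Because of operator precedence, each branch of the | gets only one of the zero-width anchors: the start (^) applies to the first branch and the end ($) to the second. To catch only = or ==, you have to group the alternation so that both anchors apply to every branch: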

(/^([-$+)(}{]|={1,2})$/).test(token);
Adrian answered Dec 20 '25 09:12


