Here is my regular expression:
"button:not([DISABLED])".match(/\([^()]+\)|[^()]+/g);
The result is:
["button:not", "([DISABLED])"]
Is it correct? I'm confused. Because the pipe operator (|) means "or", I thought the correct result would be:
["button:not", "[DISABLED]", "([DISABLED])"]
Because this:
["button:not", "[DISABLED]"]
is the result of:
"button:not([DISABLED])".match(/[^()]+/g);
and this:
["([DISABLED])"]
is the result of:
"button:not([DISABLED])".match(/\([^()]+\)/g);
But the console output tells me the result is:
["button:not", "([DISABLED])"]
Where is the problem?
The regex /\([^()]+\)|[^()]+/g basically says: there are two options, match (1) \([^()]+\) OR (2) [^()]+, wherever you see either of them (/g).
Let's walk through your sample string so you can see why you get that result.
Starting string:
button:not([DISABLED])
Steps:
1. The engine begins reading at "b" (strictly, at the start-of-string position that ^ would match, but that is irrelevant for this example).
2. "b" can only match (2), as (1) requires a starting "(".
3. Having begun matching (2), it keeps going, consuming everything that is not a "(" or ")".
4. It therefore consumes everything up to and including the "t" char (because the next char is a "(", which does not match [^()]+), thus leaving button:not as the first matched string.
5. Now the engine is at "(". Does it begin to match any of the options? Yes, the first one: \([^()]+\).
6. Having begun matching (1), it consumes everything that is not a "(" or ")" until it finds a ")" (if, while consuming, it finds a "(" before a ")", alternative (1) fails at this position and the engine backtracks).
7. It reaches the ")", leaving ([DISABLED]) as the second matched string.

Edit: There's a very useful online tool that lets you see the regex in graphical form. It may help you understand how the regex works.
You can also move the cursor step by step and see what I tried to explain above: live link.
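The walk-through above can also be checked in code. The following is only a sketch: traceMatches is a hypothetical helper, and the named groups (parenthesized, plain) are added purely for illustration so that each match can report which alternative produced it. It assumes an engine that supports named groups and String.prototype.matchAll.

```javascript
// Hypothetical helper: report where each match starts and which alternative produced it.
function traceMatches(input) {
  // Same pattern as in the question, with each alternative wrapped in a named group.
  const pattern = /(?<parenthesized>\([^()]+\))|(?<plain>[^()]+)/g;
  const trace = [];
  for (const m of input.matchAll(pattern)) {
    trace.push({
      index: m.index, // position in the input where this match begins
      text: m[0],     // the matched substring
      // Only the group belonging to the alternative that matched is defined.
      alternative: m.groups.parenthesized !== undefined ? 1 : 2,
    });
  }
  return trace;
}

console.log(traceMatches("button:not([DISABLED])"));
// [
//   { index: 0, text: "button:not", alternative: 2 },
//   { index: 10, text: "([DISABLED])", alternative: 1 }
// ]
```

Each character of the string ends up in at most one match, which is why [DISABLED] never shows up on its own: by the time the engine could try (2) on it, alternative (1) has already consumed it.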
Note about the precedence of expressions separated by |: because of the way the JavaScript regex engine processes strings, the order in which the alternatives appear matters. At each position it tries the alternatives in the order they are given. If one of those options matches to the end, it will not attempt any other option at that position, even if another one could match too. Hopefully an example makes it clearer:
"aaa".match(/a|aa|aaa/g); // ==> ["a", "a", "a"] "aaa".match(/aa|aaa|a/g); // ==> ["aa", "a"] "aaa".match(/aaa|a|aa/g); // ==> ["aaa"]
Your understanding of the alternation operator seems to be incorrect. It does not look for all possible matches, only for the first one that matches (from left to right).
Consider (a | b) as "match either a or b".
See also: http://www.regular-expressions.info/alternation.html
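If what you actually want is every substring that either pattern can find on its own (the result you expected), one possible approach is to run the two patterns separately and combine the results yourself. This is only a sketch, and the variable names are made up for the example:

```javascript
const subject = "button:not([DISABLED])";

// Each pattern scans the whole string independently of the other.
// match() returns null when nothing matches, hence the "|| []" fallback.
const plainParts = subject.match(/[^()]+/g) || [];
const parenParts = subject.match(/\([^()]+\)/g) || [];

// Combine the two result lists by hand.
const allParts = plainParts.concat(parenParts);

console.log(allParts); // ["button:not", "[DISABLED]", "([DISABLED])"]
```

Here the same characters can appear in more than one result because the two scans do not share a position in the string, unlike the alternatives of a single regex.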