The following regex picks out the tokens needed to construct an s-expression from a JS string. It is followed by a large block comment documenting how it is built up. I included the comment because I am new to regex and may be misunderstanding one of these points. What I don't understand is why each result that regex.exec() returns is the same match repeated twice and grouped as a list.
var tx = /\s*(\(|\)|[^\s()]+|$)/g; // Create a regular expression
/*       /1 234  5  6      7   /global search
        1. \s      : whitespace metacharacter
        2. n*      : matches any string that contains zero or more 
                     occurrences of n
        3. (a|b|c) : find any of the alternatives specified
        4. \(      : escaped open paren, match "(" (since parens are reserved 
                     characters in regex)
        5. \)      : escaped close paren, match ")"
        6. [^abc]  : find any character not between the brackets
        7. n+      : matches any string that contains at least one n
RESULT - Find matches that have zero or more leading whitespace characters (1+2) 
that are one of the following (3): open paren (4) -OR- close paren (5)
-OR- any match that is at least one non-whitespace, non-paren character (6+7) 
-OR- $ (end of input), searching globally to find all matches */
var textExpression = "(1 2 3)";
var execSample;
for (var i = 0; i < textExpression.length; i++) {
    execSample = tx.exec(textExpression);
    display(execSample);
}
Here is what is printed:
(,(
1,1
 2,2
 3,3
),)
,
null
Why are the matches repeated as lists?
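You're not getting exactly the same items in the printed list. The first item is the whole match ($0), which includes any leading whitespace; the second is capture group 1 ($1), which does not.
If you change your regex to this: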
var tx = /\s*(?:\(|\)|[^\s()]+|$)/g;
Then you will get a single item in the printed list.
It's because you've got that parenthesized group in your regular expression. The .exec() function returns an array. In the array, the first element (element 0) will contain the entire match, and then the subsequent elements contain the matched groups.
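To make the array layout concrete, here is a minimal sketch that collects element 0 and element 1 of each result for the regex and input from the question. The helper name collectMatches is hypothetical, and the break on an empty match is my own guard (not part of the original code) so the loop cannot spin forever on the end-of-input match.

```javascript
// Hypothetical helper: gather [whole match, group 1] for every exec() result.
function collectMatches(regex, text) {
  regex.lastIndex = 0; // start from the beginning of the string
  var results = [];
  var m;
  while ((m = regex.exec(text)) !== null) {
    results.push([m[0], m[1]]); // m[0] = entire match, m[1] = first capture group
    if (m[0] === "") break;     // empty match at end of input: stop to avoid looping forever
  }
  return results;
}

var capturing = /\s*(\(|\)|[^\s()]+|$)/g;
console.log(JSON.stringify(collectMatches(capturing, "(1 2 3)")));
// [["(","("],["1","1"],[" 2","2"],[" 3","3"],[")",")"],["",""]]
```

Note that the two elements only look identical when there is no leading whitespace: for the third token, element 0 is " 2" while element 1 is "2".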
If you don't want that, you can use a non-capturing group:
var tx = /\s*(?:\(|\)|[^\s()]+|$)/g; 
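As a rough illustration of what changes with the non-capturing version, the sketch below copies each exec() result into a plain array. The helper name wholeMatches is hypothetical, and the break on an empty match is an added guard against looping on the end-of-input match.

```javascript
// Hypothetical helper: copy every exec() result into a plain array.
function wholeMatches(regex, text) {
  regex.lastIndex = 0; // start from the beginning of the string
  var results = [];
  var m;
  while ((m = regex.exec(text)) !== null) {
    results.push(Array.prototype.slice.call(m)); // drops the extra index/input properties
    if (m[0] === "") break; // empty match at end of input: stop to avoid looping forever
  }
  return results;
}

var nonCapturing = /\s*(?:\(|\)|[^\s()]+|$)/g;
console.log(JSON.stringify(wholeMatches(nonCapturing, "(1 2 3)")));
// [["("],["1"],[" 2"],[" 3"],[")"],[""]]
```

Each result now has a single element, but that element is the whole match, so it still carries the leading whitespace (" 2", " 3"). If you need the bare token, you would either trim it yourself or keep the capturing group and read element 1.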