
Why do string.split() results include undefined?

I would like to split a string on either %\d+ or \n. I was able to successfully split on either one of these two, but not on both:

> msg = 'foo %1 bar \n baz %2'

> msg.split(/(%\d+)/)
["foo ", "%1", " bar 
 baz ", "%2", ""]

> msg.split(/(\n)/)
["foo %1 bar ", "
", " baz %2"]

> msg.split(/(\n)|(%\d)/)
["foo ", undefined, "%1", " bar ", "
", undefined, " baz ", undefined, "%2", ""]

In the last case, why is undefined in the resulting array, and what should I be doing?

Update: I neglected to state that I need the delimiters. The result I want is:

["foo ", "%1", " bar ", "\n", " baz ", "%2"]
Ellen Spertus asked Dec 19 '13 23:12


1 Answer

Quoting the MDN doc for String.prototype.split:

If separator is a regular expression that contains capturing parentheses, then each time separator is matched, the results (including any undefined results) of the capturing parentheses are spliced into the output array.

The point is that every capturing group is spliced into the output, even one that did not take part in the match. The first undefined in your example comes from the (\n) group, which captured nothing because the split occurred on %\d; the second comes from the (%\d) group, which captured nothing because \n matched; and so on.
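
To see where those undefined entries come from, here is a small sketch that inspects the capture slots of each match directly. It assumes an ES2020 runtime (for matchAll); the variable name captures is mine, not from the question.

```javascript
const msg = 'foo %1 bar \n baz %2';

// Each match of /(\n)|(%\d)/ has two capture slots. Only the alternative
// that actually matched fills its slot; the other one stays undefined.
const captures = [...msg.matchAll(/(\n)|(%\d)/g)].map(m => [m[1], m[2]]);

console.log(captures);
// [[undefined, '%1'], ['\n', undefined], [undefined, '%2']]
```

split() splices both slots into the result after every match, which is why an undefined follows or precedes each delimiter in the output.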

To solve this, you can drop the capturing groups altogether. The parentheses are not needed for grouping here, since the alternation operator has the lowest precedence anyway:

msg.split(/\n|%\d/); // ["foo ", " bar ", " baz ", ""]

If you need the separators as well, use a single capturing group:

msg.split(/(\n|%\d)/); 
// ["foo ", "%1", " bar ", "\n", " baz ", "%2", ""]
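
The result above still ends with an empty string, because msg ends with a delimiter. One way to get exactly the array asked for in the question is to filter empty strings out afterwards. This is only a sketch: the variable name parts is mine, and I have used %\d+ (as the question describes) rather than %\d.

```javascript
const msg = 'foo %1 bar \n baz %2';

// A single capturing group keeps the delimiters in the output;
// filter() then drops empty strings such as the trailing "".
const parts = msg.split(/(\n|%\d+)/).filter(part => part !== '');

console.log(parts);
// ["foo ", "%1", " bar ", "\n", " baz ", "%2"]
```

Note that the filter also removes empty strings that appear between two adjacent delimiters or before a leading one, so skip it if those positions matter to you.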
raina77ow answered Nov 14 '22 21:11