
Why do nested parentheses cause empty strings in this regex?

var str = "ab((cd))ef";
var arr = str.split(/([\)\(])/);
console.log(arr); // ["ab", "(", "", "(", "cd", ")", "", ")", "ef"] 

What I want to achieve is this:

["ab", "(", "(", "cd", ")", ")", "ef"] 
wubbewubbewubbe asked Nov 17 '13 11:11


2 Answers

The outer parentheses in your regular expression act as a capturing group. From the documentation of split (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split):

If separator is a regular expression that contains capturing parentheses, then each time separator is matched, the results (including any undefined results) of the capturing parentheses are spliced into the output array.
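
To make the quoted rule concrete, here is a minimal sketch that splits the same string with and without the capturing group. The variable names are only illustrative. Note that the empty strings appear in both results: they are the zero-length pieces between two adjacent separators, and the capturing group merely interleaves the separators with them.

```javascript
var str = "ab((cd))ef";

// No capturing group: the matched separators are dropped from the result.
var withoutGroup = str.split(/[()]/);

// Capturing group: each matched separator is spliced into the result.
var withGroup = str.split(/([()])/);

console.log(withoutGroup); // ["ab", "", "cd", "", "ef"]
console.log(withGroup);    // ["ab", "(", "", "(", "cd", ")", "", ")", "ef"]
```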

You didn't say exactly what you want to achieve with your regex. Perhaps you want something like this:

var str = "ab((cd))ef";
var arr = str.split(/[\)\(]+/);
console.log(arr); // ["ab", "cd", "ef"] 

EDIT:

Each parenthesis matches the regex individually, so the array is built up like this (one line per parenthesis matched):

['ab', '('] // matched (
['ab', '(', '', '('] // matched ( (between the last two matches is the empty string)
['ab', '(', '', '(', 'cd', ')'] // matched )
['ab', '(', '', '(', 'cd', ')', '', ')'] // matched )
['ab', '(', '', '(', 'cd', ')', '', ')', 'ef'] // string end
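
The trace above can be reproduced with a small loop. This is only a rough sketch of the idea, not how any engine actually implements split: `traceSplit` is a hypothetical helper, and it assumes a separator with exactly one capturing group that never matches the empty string.

```javascript
// Hypothetical helper that mimics split() for a separator with one
// capturing group. Assumes every match is at least one character long;
// a zero-length match would make this loop run forever.
function traceSplit(str, separator) {
  var re = new RegExp(separator.source, "g"); // global copy so exec() advances
  var result = [];
  var last = 0; // index just past the previous match
  var m;
  while ((m = re.exec(str)) !== null) {
    result.push(str.slice(last, m.index)); // text before the match, possibly ""
    result.push(m[1]);                     // the captured separator itself
    last = re.lastIndex;
    console.log(JSON.stringify(result));   // one line per parenthesis matched
  }
  result.push(str.slice(last)); // remainder after the last match
  return result;
}

var traced = traceSplit("ab((cd))ef", /([()])/);
console.log(traced); // ["ab", "(", "", "(", "cd", ")", "", ")", "ef"]
```

When two parentheses are adjacent, `last` and `m.index` are equal, so the slice between them is the empty string.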

EDIT2:

Required output is: ["ab", "(", "(", "cd", ")", ")", "ef"]

I am not sure this can be done with a single split. The quickest and safest approach is to filter out the empty strings:

var str = "ab((cd))ef";
var arr = str.split(/([\)\(])/).filter(function(item) { return item !== '';});
console.log(arr); // ["ab", "(", "(", "cd", ")", ")", "ef"]
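
For comparison, here are two alternatives that were not part of this answer, offered as a sketch only. The first uses `match` with a global regex to collect tokens directly instead of splitting. The second splits on zero-width positions next to a parenthesis; it relies on lookbehind, which is an assumption about the runtime (ES2018 or later) and was not available when the question was asked. The names `tokens` and `viaSplit` are illustrative.

```javascript
var str = "ab((cd))ef";

// Match either a single parenthesis or a run of non-parenthesis characters.
// Note: match() returns null rather than [] when nothing matches at all.
var tokens = str.match(/[()]|[^()]+/g);
console.log(tokens); // ["ab", "(", "(", "cd", ")", ")", "ef"]

// Split at every position just before or just after a parenthesis.
// Requires lookbehind support (ES2018+).
var viaSplit = str.split(/(?=[()])|(?<=[()])/);
console.log(viaSplit); // ["ab", "(", "(", "cd", ")", ")", "ef"]
```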
Tibos answered Oct 04 '22 20:10


Interesting question!

I'm unsure as to why, but if you chain

.filter(function(el){ return el !== "";});

onto your split, you can get rid of the empty strings:

var str = "ab((cd))ef";
var arr = str.split(/([\)\(])/).filter(function(el) { return el !== "";});
console.log(arr); // ["ab", "(", "(", "cd", ")", ")", "ef"]
Mikebert4 answered Oct 04 '22 19:10