I have a string that I need to split by whitespace, but any words inside square brackets must not be split.
For example:
input: 'tree car[tesla BMW] cat color[yellow blue] dog'
output: ['tree', 'car[tesla BMW]', 'cat', 'color[yellow blue]', 'dog']
If I use a simple .split(' '),
it also splits inside the brackets and returns an incorrect result.
I've also tried to write a regex, but without success.
My last attempt looks like this: .split(/(?:(?<=\[).+?(?=\])| )+/)
and it returns ["tree", "car[", "]", "cat", "color[", "]", "dog"]
I would be really grateful for any help.
This is easier with match:
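const input = 'tree car[tesla BMW] cat xml:cat xml:color[yellow blue] dog'
// Match a run of non-bracket, non-whitespace chars, optionally followed by one [...] group
const output = input.match(/[^[\]\s]+(\[.+?\])?/g)
console.log(output)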
With split you need a lookahead like this:
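const input = 'tree car[tesla BMW] cat color[yellow blue] dog'
// Split on a space unless a ] comes next before any [ (i.e. the space is inside brackets)
const output = input.split(/ (?![^[]*\])/)
console.log(output)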
Both snippets only work if brackets are not nested, otherwise you'd need a parser rather than a regexp.
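For the nested case, a small hand-written scanner is usually enough. The following is only a minimal sketch, not something from the answers above: the function name splitOutsideBrackets is hypothetical, and it assumes that whitespace should separate items only when the bracket depth is zero and that a stray ] at depth zero is kept as an ordinary character.

```javascript
// Hypothetical helper (not from the original answers): split on whitespace
// only while outside all square brackets, tracking nesting depth.
function splitOutsideBrackets(text) {
  const parts = [];
  let current = '';
  let depth = 0;

  for (const ch of text) {
    if (ch === '[') {
      depth++;
    } else if (ch === ']' && depth > 0) {
      depth--;
    }

    if (/\s/.test(ch) && depth === 0) {
      // Whitespace outside brackets ends the current item.
      if (current) parts.push(current);
      current = '';
    } else {
      // Everything else, including whitespace inside brackets, is kept.
      current += ch;
    }
  }

  if (current) parts.push(current);
  return parts;
}

console.log(splitOutsideBrackets('tree car[tesla [S X] BMW] cat'));
```

Because it counts depth instead of looking for the next ], the inner [S X] group does not end the outer one early.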
You could split on a space while asserting what follows to the right: one or more non-whitespace characters other than square brackets, then optionally a group from an opening to a closing square bracket, and finally a whitespace boundary.
[ ](?=[^\][\s]+(?:\[[^\][]*])?(?!\S))
Explanation
[ ]  Match a space (square brackets only for clarity)
(?=  Positive lookahead
[^\][\s]+  Match 1+ times any char except ], [ or a whitespace char
(?:\[[^\][]*])?  Optionally match from [...]
(?!\S)  A whitespace boundary to the right
)  Close lookahead
const regex = / (?=[^\][\s]+(?:\[[^\][]*])?(?!\S))/g;
[
"tree car[tesla BMW] cat color[yellow blue] dog",
"tree car[tesla BMW] cat xml:cat xml:color[yellow blue] dog",
"tree:test car[tesla BMW]",
"tree car[tesla BMW] cat color yellow blue] dog",
"tree car[tesla BMW] cat color[yellow blue dog"
].forEach(s => console.log(s.split(regex)));
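The last two test strings above have unbalanced brackets. As a rough comparison, the sketch below runs the shorter lookahead from the earlier split answer and the boundary-based pattern side by side on a balanced string and on one with a stray ]. The variable names are illustrative only and do not come from the answers.

```javascript
// Illustrative names; the two patterns themselves are taken from the answers.
const simpleLookahead = / (?![^[]*\])/;
const boundaryLookahead = / (?=[^\][\s]+(?:\[[^\][]*])?(?!\S))/;

const samples = [
  'tree car[tesla BMW] cat color[yellow blue] dog', // balanced brackets
  'tree car[tesla BMW] cat color yellow blue] dog', // stray closing bracket
];

// Collect the result of each pattern for each sample.
const results = samples.map(s => ({
  input: s,
  simple: s.split(simpleLookahead),
  boundary: s.split(boundaryLookahead),
}));

for (const r of results) {
  console.log(r.input);
  console.log('  simple:  ', r.simple);
  console.log('  boundary:', r.boundary);
}
```

On the balanced string both patterns give the same five items. On the string with the stray ], the shorter lookahead treats every space before that ] (back to the previous closing bracket) as being inside brackets and keeps that whole stretch as one item, while the boundary-based pattern still splits the unbracketed words individually.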