
Split string with regex skipping brackets []

I have a string that I need to split on whitespace, but whitespace inside square brackets should not be treated as a separator.

For example,

input: 'tree car[tesla BMW] cat color[yellow blue] dog'

output: ['tree', 'car[tesla BMW]', 'cat', 'color[yellow blue]', 'dog']

If I use a simple .split(' '), it also splits inside the brackets and returns an incorrect result.

I've also tried to write a regex, but without success :(

My latest regex is .split(/(?:(?<=\[).+?(?=\])| )+/), and it returns ["tree", "car[", "]", "cat", "color[", "]", "dog"]

I would be really grateful for any help.

MarkMark asked May 21 '21 12:05


2 Answers

This is easier with match:

const input = 'tree car[tesla BMW] cat xml:cat xml:color[yellow blue] dog'
// One token: non-bracket, non-space chars, optionally followed by a [...] part
const output = input.match(/[^[\]\s]+(\[.+?\])?/g)
console.log(output)

With split you need a lookahead like this:

const input = 'tree car[tesla BMW] cat color[yellow blue] dog'
// Split on a space unless a ] follows before any [ (i.e. we are inside brackets)
const output = input.split(/ (?![^[]*\])/)
console.log(output)

Both snippets only work if the brackets are not nested; for nested brackets you would need a parser rather than a regexp.
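
As a rough illustration of what such a parser might look like, here is a minimal sketch that walks the string once and tracks bracket depth. The function name splitOutsideBrackets is made up for this example, and the handling of unbalanced brackets (a stray ] is ignored, an unclosed [ swallows the rest of the string) is an assumption, not something the question specifies.

```javascript
// Split on whitespace, but only when not inside square brackets.
// Tracks nesting depth, so nested brackets stay inside a single token.
// Hypothetical helper, not part of any library.
function splitOutsideBrackets(text) {
  const tokens = [];
  let current = '';
  let depth = 0;

  for (const ch of text) {
    if (ch === '[') {
      depth += 1;
    } else if (ch === ']' && depth > 0) {
      // Assumption: a stray closing bracket does not change the depth.
      depth -= 1;
    }

    if (/\s/.test(ch) && depth === 0) {
      // Whitespace at the top level ends the current token, if there is one.
      if (current !== '') {
        tokens.push(current);
        current = '';
      }
    } else {
      current += ch;
    }
  }

  if (current !== '') {
    tokens.push(current);
  }
  return tokens;
}

console.log(splitOutsideBrackets('tree car[tesla BMW] cat color[yellow blue] dog'));
console.log(splitOutsideBrackets('tree car[tesla [model S] BMW] cat'));
```

Unlike the regex versions, this also collapses runs of whitespace between tokens, which may or may not be what you want.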

georg answered Oct 14 '22 04:10


You could split on a space, using a lookahead to assert that what follows is one or more non-whitespace characters other than square brackets, optionally followed by a part running from an opening to a closing square bracket, and then a whitespace boundary on the right.

[ ](?=[^\][\s]+(?:\[[^\][]*])?(?!\S))

Explanation

  • [ ] Match a space (the square brackets are only there for clarity)
  • (?= Positive lookahead
    • [^\][\s]+ Match 1+ times any char except ] [ or a whitespace char
    • (?:\[[^\][]*])? Optionally match from [ to ] with no brackets in between
    • (?!\S) A whitespace boundary to the right
  • ) Close lookahead
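
To make the breakdown easier to follow, the same pattern can be assembled from named pieces. This is only a sketch: the variable names (word, bracketed, boundary, splitter) are invented here, and String.raw is used so the backslashes reach the RegExp constructor unchanged.

```javascript
// Assemble the split pattern from the pieces listed above.
const word = String.raw`[^\][\s]+`;            // 1+ chars that are not ] [ or whitespace
const bracketed = String.raw`(?:\[[^\][]*])?`; // optional [...] without nested brackets
const boundary = String.raw`(?!\S)`;           // whitespace or end of string must follow
const splitter = new RegExp(`[ ](?=${word}${bracketed}${boundary})`);

console.log('tree car[tesla BMW] cat color[yellow blue] dog'.split(splitter));
```

The space before "BMW]" is not a split point because "BMW" is followed by ], which is neither an opening bracket nor a whitespace boundary.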


const regex = / (?=[^\][\s]+(?:\[[^\][]*])?(?!\S))/g;
[
  "tree car[tesla BMW] cat color[yellow blue] dog",
  "tree car[tesla BMW] cat xml:cat xml:color[yellow blue] dog",
  "tree:test car[tesla BMW]",
  "tree car[tesla BMW] cat color yellow blue] dog",
  "tree car[tesla BMW] cat color[yellow blue dog"
].forEach(s => console.log(s.split(regex)));

The fourth bird answered Oct 14 '22 04:10