
javascript/regex to ignore semicolons in double quotes

I've been stumped for a bit on this one. I have a string that is almost a semicolon-delimited string; it looks something like this:

one; two; three "four; five;six"; seven

I'd like to split this up using a regex in JavaScript into an array like this (i.e. ignoring any semicolons inside double quotes):

['one','two','three "four; five;six"','seven']

I've tried adapting known working CSV functions, but they don't seem to be adaptable to the third element ('three "four; five;six"').

It seems like a regex type of problem, but if a solution exists using more than regex, I'm certainly interested!

update: I should also note that there may be spaces before or after the semicolons in the quoted string. I've updated the example to reflect that.

stockholmux asked Dec 21 '25 00:12



1 Answer

Assuming you don't allow for escaped quotes inside your quotes (e.g. "this has \"escaped quotes\" inside") then this should work:

var rx = /(?!;|$)[^;"]*(("[^"]*")[^;"]*)*/g;
var str = 'one; two; three "four;five;six"; seven';
var res = str.match(rx);
// res = ['one', ' two', ' three "four;five;six"', ' seven']

Note that you need the negative lookahead (?!;|$) at the beginning of the regex to keep it from matching the empty string. Every other part of the pattern is quantified with *, so without the lookahead the match method also returns an empty string at each semicolon and at the end of the input.
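
To make the empty-match behaviour concrete, here is a small sketch comparing the pattern with and without the lookahead. The variable names and the shortened sample string are only for illustration and are not part of the original answer.

```javascript
// Same pattern as above, once with and once without the leading negative lookahead.
var withLookahead = /(?!;|$)[^;"]*(("[^"]*")[^;"]*)*/g;
var withoutLookahead = /[^;"]*(("[^"]*")[^;"]*)*/g;
var sample = 'one;two'; // illustrative input

var kept = sample.match(withLookahead);
var noisy = sample.match(withoutLookahead);
// kept  = ['one', 'two']
// noisy = ['one', '', 'two', '']  (empty matches at the semicolon and at the end)
```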

Update:

I think this regular expression should work with escaped quotes as well (although I'd appreciate feedback on its correctness). I've also added \s to the negative lookahead so that whitespace after the preceding semicolon is stripped.

/(?!\s|;|$)[^;"]*("(\\.|[^\\"])*"[^;"]*)*/g
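
As a rough usage sketch, the updated pattern could be wrapped in a small helper. The function name splitOutsideQuotes and the trim() step are my own assumptions, not part of the pattern itself; trim() is there because the lookahead only skips whitespace before a field, not after it.

```javascript
// Hypothetical helper wrapping the updated pattern; the name is an assumption.
function splitOutsideQuotes(str) {
  var rx = /(?!\s|;|$)[^;"]*("(\\.|[^\\"])*"[^;"]*)*/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  var parts = str.match(rx) || [];
  // The lookahead skips leading whitespace only; trim() drops trailing whitespace.
  return parts.map(function (part) { return part.trim(); });
}

var plain = splitOutsideQuotes('one; two; three "four; five;six"; seven');
// plain = ['one', 'two', 'three "four; five;six"', 'seven']

var escaped = splitOutsideQuotes('a ; "b \\"c; d\\" e" ; f');
// escaped = ['a', '"b \\"c; d\\" e"', 'f']
```

Note that this approach skips empty fields, so 'a;;b' yields ['a', 'b'] rather than ['a', '', 'b'].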
DaoWen answered Dec 23 '25 13:12
