I'm wanting to match any instance of text in a comma-delimited list. For this, the following regular expression works great:
/[^,]+/g
(Regex101 demo).
The problem is that I'm wanting to ignore any commas which are contained within either single or double quotes and I'm unsure how to extend the above selector to allow me to do that.
Here's an example string:
abcd, efgh, ij"k,l", mnop, 'q,rs't
I'm wanting to either match the five chunks of text or match the four relevant commas (so I can retrieve the data using split() instead of match()):
abcd
efgh
ij"k,l"
mnop
'q,rs't
Or:
abcd, efgh, ij"k,l", mnop, 'q,rs't
    ^     ^        ^     ^
How can I do this?
Three relevant questions exist, but none of them cater for both ' and " in JavaScript.
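Okay, so your matching groups can contain runs of characters that are neither commas nor quotes, double-quoted strings, and single-quoted strings.
So this should work: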
/((?:[^,"']+|"[^"]*"|'[^']*')+)/g
RegEx101 Demo
As a nice bonus, you can include single quotes inside double-quoted strings, and vice versa. However, you'll probably need a state machine to support escaped double quotes inside double-quoted strings (e.g. "aa\"aa").
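For the escaped-quote case, a hand-written scanner could look roughly like the sketch below. This is not part of the regex above: the function name splitOutsideQuotes is made up for illustration, and it assumes that a backslash escapes whatever character follows it, both inside and outside quotes.

```javascript
// Hypothetical helper: split on commas that are outside quotes,
// assuming a backslash escapes the character that follows it.
function splitOutsideQuotes(input) {
  const parts = [];
  let current = "";
  let quote = null; // the quote character we are currently inside, or null

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];

    if (ch === "\\" && i + 1 < input.length) {
      // Copy the backslash and the escaped character through untouched.
      current += ch + input[i + 1];
      i++;
    } else if (quote !== null) {
      // Inside a quoted run: only the matching quote character closes it.
      if (ch === quote) quote = null;
      current += ch;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      current += ch;
    } else if (ch === ",") {
      // A comma outside quotes ends the current chunk.
      parts.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }

  parts.push(current.trim());
  return parts;
}

console.log(splitOutsideQuotes(`abcd, efgh, ij"k,l", mnop, 'q,rs't`));
console.log(splitOutsideQuotes('x, "aa\\"a,a", y'));
```

An unclosed quote simply swallows the rest of the string into the last chunk; whether that is the right behaviour depends on your data.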
Unfortunately, the regex also matches the leading space of each chunk, so you'll have to trim the matches.
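Putting the regex and the trimming together might look like this. The function name matchChunks is only for illustration, and the `|| []` fallback is there because match() returns null when nothing matches.

```javascript
// Hypothetical wrapper around the regex from this answer.
function matchChunks(input) {
  const chunkPattern = /((?:[^,"']+|"[^"]*"|'[^']*')+)/g;
  // match() returns null when there is no match, so fall back to [].
  const matches = input.match(chunkPattern) || [];
  // Each chunk after the first starts with the space that followed the comma.
  return matches.map((chunk) => chunk.trim());
}

console.log(matchChunks(`abcd, efgh, ij"k,l", mnop, 'q,rs't`));
// five chunks: abcd | efgh | ij"k,l" | mnop | 'q,rs't
```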
Using a double lookahead to assert that the matched comma is outside quotes:
/(?=(([^"]*"){2})*[^"]*$)(?=(([^']*'){2})*[^']*$)\s*,\s*/g
(?=(([^"]*"){2})*[^"]*$) asserts that there is an even number of double quotes ahead of the matching comma. (?=(([^']*'){2})*[^']*$) does the same assertion for single quotes.
PS: This doesn't handle the case of unbalanced, nested or escaped quotes.
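One thing to watch if you feed this to split(): in JavaScript, split() splices the contents of any capturing groups into the result array, and that includes groups inside lookaheads. The sketch below therefore swaps the capturing groups for non-capturing (?:...) ones. That change is my adaptation, not part of the pattern as posted, and the function name splitOnOuterCommas is made up for illustration.

```javascript
// Same two lookaheads as above, but with non-capturing groups so that
// split() does not add captured text (or undefined) to its result.
const outerComma =
  /(?=(?:(?:[^"]*"){2})*[^"]*$)(?=(?:(?:[^']*'){2})*[^']*$)\s*,\s*/;

// Hypothetical helper: split on commas followed by an even number of
// double quotes and an even number of single quotes.
function splitOnOuterCommas(input) {
  return input.split(outerComma);
}

console.log(splitOnOuterCommas(`abcd, efgh, ij"k,l", mnop, 'q,rs't`));
// five chunks: abcd | efgh | ij"k,l" | mnop | 'q,rs't
```

Because the \s* on either side of the comma is part of the separator, the chunks come back without the extra space. The limits noted above still apply: an apostrophe inside a double-quoted string (for example "it's") throws off the single-quote count, so commas before it are no longer treated as separators.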
RegEx Demo