
Regex find comma not inside quotes

Tags: regex

I'm checking the data line by line in C#.

Example data:

bob jones,123,55.6,,,"Hello , World",,0
jim neighbor,432,66.5,,,Andy "Blank,,1
john smith,555,77.4,,,Some value,,2

The closest thing I've found is "Regex to pick commas outside of quotes", but it doesn't handle the second line, where the quote is never closed.

Chris Hayes asked Jan 14 '14 03:01


2 Answers

Stand back and be amazed!


Here is the regex you seek:

(?!\B"[^"]*),(?![^"]*"\B)


Here is a demonstration:

regex101 demo


  • On the second line, the " you inserted does not have a closing quotation mark, so it is not treated as the start of a quoted value and every comma on that line is matched.
  • It will not match values such as ,r"a string",10 because the letter on the edge of the " creates a word boundary rather than a non-word boundary.
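To see how the pattern behaves on the sample data, here is a minimal sketch. It uses Python's re module rather than C# purely for illustration; the pattern relies only on lookaheads and \B, which .NET also supports. The names COMMA_OUTSIDE_QUOTES, LINES and split_fields are made up for this example.

```python
import re

# Pattern copied verbatim from the answer above.
COMMA_OUTSIDE_QUOTES = re.compile(r'(?!\B"[^"]*),(?![^"]*"\B)')

# The three sample lines from the question.
LINES = [
    'bob jones,123,55.6,,,"Hello , World",,0',
    'jim neighbor,432,66.5,,,Andy "Blank,,1',
    'john smith,555,77.4,,,Some value,,2',
]


def split_fields(line):
    """Split one line on every comma the pattern accepts."""
    return COMMA_OUTSIDE_QUOTES.split(line)


for line in LINES:
    print(split_fields(line))
```

On the first line the comma inside "Hello , World" is left alone, and on the second line the unclosed quote simply stays part of the field Andy "Blank. Because the pattern leans on \B next to the quotes, a quoted value that begins with punctuation, for example x,"(a, b)",y, is not split before the opening quote, so treat this as a quick check rather than a general CSV splitter.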

Alternative version

(".*?,.*?"|.*?(?:,|$))

This matches the content of each value together with the commas, and it is compatible with values that are full of punctuation marks.

regex101 demo
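A small sketch of what the alternative pattern returns, again in Python for illustration only (lazy quantifiers and non-capturing groups behave the same way in .NET). FIELD and tokens are hypothetical names, and the filter for empty matches is an addition of this sketch, not part of the answer.

```python
import re

# Alternative pattern copied verbatim from the answer above.
FIELD = re.compile(r'(".*?,.*?"|.*?(?:,|$))')


def tokens(line):
    """Return every non-empty token the pattern finds, left to right."""
    return [m.group(1) for m in FIELD.finditer(line) if m.group(1)]


print(tokens('bob jones,123,55.6,,,"Hello , World",,0'))
print(tokens('jim neighbor,432,66.5,,,Andy "Blank,,1'))
```

Unquoted values come back with their trailing comma attached (bob jones, then 123, and so on), while a quoted value containing a comma comes back as a single token without it, so the comma that follows "Hello , World" shows up as a token of its own. Any code consuming these tokens has to account for that difference.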

Vasili Syrakis answered Oct 26 '22 03:10


The regexes below parse one field of a line at a time, not an entire line.

Apply the methodical and desperate regex technique: divide and conquer.

Case: field does not contain a quote

  • abc,
  • abc(end of line)

[^,"]*(,|$)

Case: field contains exactly two quotes

  • abc"abc,"abc,
  • abc"abc,"abc(end of line)

[^,"]*"[^"]*"[^,"]*(,|$)

Case: field contains exactly one quote

  • abc"abc(end of line)
  • abc"abc, (with no other quote before the end of the line)

[^,"]*"[^,"]*$

[^,"]*"[^,"]*,(?!.*")

Now that we have all the cases, we then '|' everything together and enjoy the resultant monstrosity.
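The combined pattern might look like the following sketch. It is written in Python for illustration (the same constructs exist in .NET) and is one possible way to join the cases, not necessarily the exact regex the answer had in mind. It assumes the one-quote cases allow any number of characters after the quote, i.e. [^,"]* rather than a single character. FIELD and parse_fields are hypothetical names.

```python
import re

# The cases above joined with "|"; re.VERBOSE allows spacing and comments.
# Assumption: the one-quote cases use [^,"]* after the quote so the rest of
# the field can be any length.
FIELD = re.compile(r'''
      [^,"]* (?:,|$)                    # no quote in the field
    | [^,"]* "[^"]*" [^,"]* (?:,|$)     # exactly two quotes
    | [^,"]* " [^,"]* $                 # one quote, field ends the line
    | [^,"]* " [^,"]* , (?!.*")         # one quote, no later quote on the line
''', re.VERBOSE)


def parse_fields(line):
    """Consume the line one field at a time using the combined pattern."""
    fields = []
    pos = 0
    while pos < len(line):
        m = FIELD.match(line, pos)
        if m is None:
            raise ValueError(f'no case matches at position {pos}: {line!r}')
        text = m.group(0)
        pos = m.end()
        if text.endswith(','):
            fields.append(text[:-1])   # drop the delimiter comma
            if pos == len(line):
                fields.append('')      # trailing comma means an empty last field
        else:
            fields.append(text)        # last field on the line
    return fields


for sample in (
    'bob jones,123,55.6,,,"Hello , World",,0',
    'jim neighbor,432,66.5,,,Andy "Blank,,1',
    'john smith,555,77.4,,,Some value,,2',
):
    print(parse_fields(sample))
```

Each of the three sample lines yields eight fields, with "Hello , World" kept whole and Andy "Blank kept as an ordinary field. Lines that fit none of the cases (for example a field with three quotes) raise ValueError here instead of being silently mis-split.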

twj answered Oct 26 '22 04:10