Splitting string on commas when data can contain commas

Tags: string, c#, regex

I have a CSV file (which I didn't design and will never be able to change) that contains lines like the following:

"Surname, Firstname", yes, no, somestring, whatever, etc

As you can see here, the first , is not a comma on which I'd want to split the string. Notice that this particular comma is enclosed within the quotation marks.

Because of this, a simple string.Split(',') obviously won't work, as it would give me an array of length 7 for the above string instead of 6.
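To make the problem concrete, here is a minimal sketch of the naive split applied to the sample line above. The class name SplitDemo is just an illustrative choice.

```csharp
using System;

class SplitDemo
{
    static void Main()
    {
        // The sample line from above, with the quotes escaped for a C# literal.
        string line = "\"Surname, Firstname\", yes, no, somestring, whatever, etc";

        // Split on every comma, including the one inside the quoted name.
        string[] parts = line.Split(',');

        Console.WriteLine(parts.Length); // 7, not the 6 fields the line really has
        Console.WriteLine(parts[0]);     // "Surname
        Console.WriteLine(parts[1]);     //  Firstname"
    }
}
```

The quoted name is cut in two, and the quotation marks stay attached to the pieces.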

Is there a way to get around this? I was thinking of using regex to split the string instead but I'm not competent enough in regex to think of a pattern that would only split on commas that are not enclosed inside quotation marks.

I can think of ugly, hacky ways to do it by reading each string char by char but this would have to be a last resort as I'm sure there's a better way to do it!

asked Jan 18 '11 18:01 by AndrewC


2 Answers

You can handle this easily by using the TextFieldParser class. Just set HasFieldsEnclosedInQuotes to true.
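A minimal sketch of that approach, assuming the sample line from the question is fed in through a StringReader (a real program would more likely pass the path of the CSV file to the constructor). TextFieldParser lives in the Microsoft.VisualBasic.FileIO namespace, so a .NET Framework project needs a reference to the Microsoft.VisualBasic assembly. The class name ParseDemo is hypothetical.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

class ParseDemo
{
    static void Main()
    {
        // Sample line from the question, used here in place of a real file.
        string line = "\"Surname, Firstname\", yes, no, somestring, whatever, etc";

        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");

            // Treat commas inside quotation marks as part of the field.
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                // One array of fields per record.
                string[] fields = parser.ReadFields();

                Console.WriteLine(fields.Length);
                foreach (string field in fields)
                {
                    Console.WriteLine(field);
                }
            }
        }
    }
}
```

With these settings the quoted name should come back as a single field without its surrounding quotation marks, so the sample line yields six fields rather than seven.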

answered Sep 30 '22 12:09 by Reed Copsey


I would suggest using a CSV parser library. There are other cases that you might not have thought of (for example, a new line as part of a quoted field).

The Microsoft.VisualBasic.FileIO namespace has a nice class that can help: TextFieldParser.

answered Sep 30 '22 14:09 by Oded