I have a CSV file (which I didn't design and I can't change now nor will I ever be able to change it) that contains lines like the following:
"Surname, Firstname", yes, no, somestring, whatever, etc
As you can see here, the first ,
is not a comma on which I'd want to split the string. Notice that this particular comma is enclosed within the quotation marks.
Because of this, a simple string.split(',')
obviously won't work, as it would give me an array of length 7 for the above string instead of 6.
Is there a way to get around this? I was thinking of using regex to split the string instead but I'm not competent enough in regex to think of a pattern that would only split on commas that are not enclosed inside quotation marks.
I can think of ugly, hacky ways to do it by reading each string char by char but this would have to be a last resort as I'm sure there's a better way to do it!
Use str. split() to convert a comma-separated string to a list. Call str. split(sep) with "," as sep to convert a comma-separated string into a list.
You can use the Python string split() function to split a string (by a delimiter) into a list of strings. To split a string by comma in Python, pass the comma character "," as a delimiter to the split() function. It returns a list of strings resulting from splitting the original string on the occurrences of "," .
Python split() method splits the string into a comma separated list. It separates string based on the separator delimiter. This method takes two parameters and both are optional.
You can handle this easily by using the TextFieldParser class. Just set HasFieldsEnclosedInQuotes
to true.
I would suggest using a CSV parser library - there are other cases that you wouldn't have thought of (new line as part of a quoted field).
The VisualBasic
namespace has a nice library that can help - the TextFieldParser
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With