Below is my code to extract data from a CSV file (the file came from a MySQL dump).
data = csv.reader(f, delimiter=',', quotechar='"')
After a few tests, I found that the code above has one big problem: it can't extract data from a line such as the one below:
"25","Mike Ross","Tennok\"","NO"
Any idea how to fix this? Thanks.
Double-quote escape characters
There are two accepted ways of escaping double quotes in a CSV file. One is to use two consecutive double quotes to denote one literal double quote in the data. The alternative is to use a backslash followed by a single double quote.
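As a rough illustration of the two conventions, the sketch below parses the same logical row written both ways with Python's csv module. The sample rows and the helper names parse_doubled and parse_backslashed are made up for this example.

```python
import csv
import io

# The same logical row, escaped two different ways (sample data, not from the question).
doubled = '"1","say ""hi""","x"\n'
backslashed = '"1","say \\"hi\\"","x"\n'

def parse_doubled(text):
    # Default dialect: "" inside a quoted field is read as one literal ".
    return next(csv.reader(io.StringIO(text)))

def parse_backslashed(text):
    # escapechar tells the reader that \" is a literal ", not the end of the field.
    return next(csv.reader(io.StringIO(text), escapechar='\\'))

row_a = parse_doubled(doubled)
row_b = parse_backslashed(backslashed)
print(row_a)  # ['1', 'say "hi"', 'x']
print(row_b)  # ['1', 'say "hi"', 'x']
```

Both calls should give the same fields; the only difference is which escaping convention the reader is told to expect.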
The data value with a comma character that is part of the data is enclosed in double quotes. The double quotes that are part of the data are escaped with a double quote even though the field value is enclosed in double quotes. Note: In CSV mode, all characters are significant.
By default, the escape character is a " (double quote) for CSV-formatted files. If you want to use a different escape character, use the ESCAPE clause of COPY, CREATE EXTERNAL TABLE or the hawq load control file to declare a different escape character.
quotechar specifies the character used to surround fields that contain the delimiter character. The default is a double quote ( ' " ' ). escapechar specifies the character used to escape the delimiter character, in case quotes aren't used.
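To make the "in case quotes aren't used" part concrete, here is a small sketch using Python's csv module with quoting switched off. The sample row is invented; the point is only that the writer then falls back to escapechar to protect a delimiter inside a field, and a reader configured the same way undoes it.

```python
import csv
import io

# Write a row without any quoting; the comma inside the first field
# has to be protected by the escape character instead.
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_NONE, escapechar='\\')
writer.writerow(['a,b', 'c'])
line = buf.getvalue()
print(repr(line))  # 'a\\,b,c\r\n'

# Read it back with the same settings to recover the original fields.
reader = csv.reader(io.StringIO(line), quoting=csv.QUOTE_NONE, escapechar='\\')
row = next(reader)
print(row)  # ['a,b', 'c']
```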
The csv module expects the quote character to be doubled up by default to indicate it's a literal ", so it'll incorrectly delimit the fields...
data = csv.reader(f, delimiter=',', quotechar='"')
# ['25', 'Mike Ross', 'Tennok\\",NO"']
Use escapechar to override this behaviour:
data = csv.reader(f, delimiter=',', quotechar='"', escapechar='\\')
# ['25', 'Mike Ross', 'Tennok"', 'NO']
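For anyone who wants to try this without the original dump file, here is a self-contained sketch that feeds the problem line from the question through io.StringIO (standing in for the open file f) and compares the two reader configurations. The variable names default_row and fixed_row are just for this example.

```python
import csv
import io

# The problem line from the question; io.StringIO stands in for the file object f.
line = '"25","Mike Ross","Tennok\\"","NO"\n'

# Default behaviour: the backslash is treated as ordinary data, so \"" is read
# as a doubled quote and the last two fields are merged.
default_row = next(csv.reader(io.StringIO(line), delimiter=',', quotechar='"'))
print(default_row)  # ['25', 'Mike Ross', 'Tennok\\",NO"']

# With escapechar set, \" is read as a literal quote and the fields split correctly.
fixed_row = next(csv.reader(io.StringIO(line), delimiter=',', quotechar='"', escapechar='\\'))
print(fixed_row)  # ['25', 'Mike Ross', 'Tennok"', 'NO']
```

When reading from a real file, the csv documentation recommends opening it with newline='' so that line endings inside quoted fields are handled by the reader.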