
Problem caused by a double quote while parsing CSV

Tags:

python

csv

I have a CSV file in the following format:

"1";"A";"A:"61 B &amp; BA";"C"

This is the code I use to read the CSV file:

import csv

# path is the location of the semicolon-delimited file shown above
with open(path, newline='') as f:
    reader = csv.reader(f, delimiter=';', quotechar='"')
    for row in reader:
        print(row)

The problem is that it splits the row into 5 fields:

['1', 'A', 'A:61 B &amp', ' BA', 'C']

whereas I expected the output to be:

['1', 'A', 'A:61 B &amp; BA', 'C']

When I remove the double quote before 61 B in the CSV file, I get:

['1', 'A', 'A:61 B &amp; BA', 'C']

which is perfectly fine. But why does a double quote in the middle of the field cause a problem even though the delimiter and quotechar have been defined?

Rohita Khatiwada asked Feb 12 '26 09:02


2 Answers

Your CSV file is invalid. If a quote occurs inside a quoted field, it must be escaped by doubling it:

"1";"A";"A:""61 B &amp; BA";"C"

would result in

['1', 'A', 'A:"61 B &amp; BA', 'C']

How should the CSV module guess the difference between quotes that delimit an item and quotes within the item?
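The difference can be checked with a small sketch. It assumes Python 3 and feeds the rows from an in-memory string (io.StringIO) instead of a file; the sample data assumes the field contains the entity &amp;, as the output in the question suggests. The helper name parse is hypothetical. With the unescaped quote, the reader leaves quoted mode early, so the semicolon inside &amp; is treated as a delimiter. With the doubled quote, the field stays intact. The last lines show that csv.writer produces the doubled form by itself.

```python
import csv
import io

# Assumed sample data: the third field contains the HTML entity "&amp;",
# whose semicolon is also the delimiter.
broken = '"1";"A";"A:"61 B &amp; BA";"C"\n'
fixed = '"1";"A";"A:""61 B &amp; BA";"C"\n'


def parse(text):
    """Parse semicolon-delimited text and return the list of rows."""
    source = io.StringIO(text, newline='')
    return list(csv.reader(source, delimiter=';', quotechar='"'))


broken_rows = parse(broken)
fixed_rows = parse(fixed)

print(len(broken_rows[0]))  # 5: the stray quote ends quoted mode too early
print(fixed_rows[0])        # ['1', 'A', 'A:"61 B &amp; BA', 'C']

# Writing the same values back: csv.writer doubles the embedded quote itself.
out = io.StringIO()
writer = csv.writer(out, delimiter=';', quoting=csv.QUOTE_ALL, lineterminator='\n')
writer.writerow(fixed_rows[0])
print(out.getvalue() == fixed)  # True
```

If the file is generated by your own code, writing it with csv.writer instead of joining strings by hand avoids the problem at the source.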

Tim Pietzcker answered Feb 15 '26 00:02


I suspect the double quote should be replaced by &quot;.
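If the file cannot be regenerated, one possible way to apply this is to rewrite each line before parsing. The sketch below is only a heuristic and assumes Python 3: it treats any quote that is not next to a delimiter or a line boundary as data. The names INNER_QUOTE and escape_inner_quotes are hypothetical, and html.unescape is used afterwards on the assumption that the fields hold HTML-encoded text.

```python
import csv
import html
import io
import re

# Heuristic (an assumption, not a csv module feature): a quote that is not
# adjacent to a ';' delimiter or a line boundary is part of the data.
INNER_QUOTE = re.compile(r'(?<=[^;\n])"(?=[^;\r\n])')


def escape_inner_quotes(line):
    """Replace quotes inside a field with the HTML entity &quot;."""
    return INNER_QUOTE.sub('&quot;', line)


raw = '"1";"A";"A:"61 B &amp; BA";"C"\n'
cleaned = escape_inner_quotes(raw)

reader = csv.reader(io.StringIO(cleaned, newline=''), delimiter=';', quotechar='"')
row = next(reader)
decoded = [html.unescape(field) for field in row]

print(row)      # ['1', 'A', 'A:&quot;61 B &amp; BA', 'C']
print(decoded)  # ['1', 'A', 'A:"61 B & BA', 'C']
```

The heuristic misfires when a data quote sits directly next to a semicolon, and it would also rewrite quotes that are already correctly doubled, so it is only suitable for files that are known to be malformed in this specific way.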

BlueMonkMN answered Feb 15 '26 00:02