I am trying to read this dataset from Kaggle: Amazon sales rank data for print and kindle books
The file amazon_com_extras.csv has a column named "TITLE" that sometimes contains a comma ',' so all the fields in this .csv are enclosed in quotation marks:
"ASIN","GROUP","FORMAT","TITLE","AUTHOR","PUBLISHER"
"022640014X","book","hardcover","The Diversity Bargain: And Other Dilemmas of Race, Admissions, and Meritocracy at Elite Universities","Natasha K. Warikoo","University Of Chicago Press"
I have read other questions related to this problem but none of them solve it. For example, I have tried:
df = pd.read_csv("amazon_com_extras.csv",engine="python",sep=',')
df = pd.read_csv("amazon_com_extras.csv",engine="python",sep=',',quotechar='"')
But nothing seems to work. I am using Python 3.7.2 and pandas 0.24.1.
This happens because some fields in the file contain unescaped quotes inside the quoted text.
I am not aware of a way to instruct the csv parser to handle that without preprocessing.
If you don't care about those rows, you can use
pd.read_csv("amazon_com_extras.csv", engine="python", sep=',', quotechar='"', error_bad_lines=False)
That suppresses the exception, but the affected lines are dropped (pandas reports them in the console). In pandas 1.3 and later the equivalent argument is on_bad_lines="skip"; error_bad_lines was removed in pandas 2.0.
An example of such a line:
"1405246510","book","hardcover",""Hannah Montana" Annual 2010","Unknown","Egmont Books Ltd"
Notice the quotes.
Instead, a more standard dialect of csv would have doubled the inner quotes:
1405246510,"book","hardcover","""Hannah Montana"" Annual 2010","Unknown","Egmont Books Ltd"
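To see which lines are affected before deciding whether to drop or repair them, a minimal sketch using only the standard library csv module is shown here. The helper name find_bad_lines is hypothetical, and the sketch assumes each record sits on a single line (no newlines inside fields).

```python
import csv

def find_bad_lines(lines):
    """Return 1-based numbers of lines that a strict CSV parser rejects."""
    bad = []
    for number, line in enumerate(lines, start=1):
        try:
            # strict=True makes the reader raise csv.Error on a quote that is
            # neither doubled nor followed by a delimiter.
            next(csv.reader([line], strict=True), None)
        except csv.Error:
            bad.append(number)
    return bad

sample = [
    '"ASIN","GROUP","FORMAT","TITLE","AUTHOR","PUBLISHER"',
    '"1405246510","book","hardcover",""Hannah Montana" Annual 2010","Unknown","Egmont Books Ltd"',
    '1405246510,"book","hardcover","""Hannah Montana"" Annual 2010","Unknown","Egmont Books Ltd"',
]
print(find_bad_lines(sample))
```

The same function can be given an open file object instead of a list, since iterating over a file yields its lines.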
You can, for example, load the file with LibreOffice and re-save it as CSV to get a working CSV dialect, or use other preprocessing techniques.
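One possible preprocessing step, sketched under assumptions: every field in the file is wrapped in quotes, fields are joined by exactly "," (quote, comma, quote), no title itself contains that three-character sequence, and no line is already correctly escaped. The helper name repair_line is hypothetical. It splits on the field boundary and doubles any quote left inside a field.

```python
import csv
import io

def repair_line(line):
    """Double the unescaped quotes inside the fields of one fully quoted line."""
    body = line.rstrip("\r\n")
    # Leave lines alone if they are not wrapped in quotes as assumed.
    if len(body) < 2 or not (body.startswith('"') and body.endswith('"')):
        return line
    # Drop the outer quotes, then split on the quote-comma-quote boundary.
    fields = body[1:-1].split('","')
    # Re-quote each field, escaping inner quotes by doubling them.
    return ",".join('"' + field.replace('"', '""') + '"' for field in fields) + "\n"

broken = '"1405246510","book","hardcover",""Hannah Montana" Annual 2010","Unknown","Egmont Books Ltd"\n'
fixed = repair_line(broken)

# The repaired line now parses even with a strict reader.
row = next(csv.reader(io.StringIO(fixed), strict=True))
print(row[3])
```

The repaired lines can be collected into an io.StringIO buffer (or written to a new file) and then passed to pd.read_csv in the usual way.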
The problem is that pandas treats the " character as a quoting mark and expects every " inside a cell to be followed by another ", which doesn't happen in this csv.
To make pandas not treat it as a quoting mark, pass the parameter quoting=3 (the value of csv.QUOTE_NONE) to the pd.read_csv function.
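To illustrate what turning quoting off means, here is a small sketch that uses the standard library csv module as a stand-in for pandas (an assumption: pandas may differ in details). With csv.QUOTE_NONE the quote characters stay in the values and every comma splits a field, so a title that contains commas is broken into several fields. The helper names are hypothetical.

```python
import csv

def split_without_quoting(line):
    """Parse one line with quoting disabled (csv.QUOTE_NONE has the value 3)."""
    return next(csv.reader([line], quoting=csv.QUOTE_NONE))

def strip_outer_quotes(field):
    """Remove one pair of surrounding quotes, if present."""
    if len(field) >= 2 and field[0] == '"' and field[-1] == '"':
        return field[1:-1]
    return field

hannah = '"1405246510","book","hardcover",""Hannah Montana" Annual 2010","Unknown","Egmont Books Ltd"'
diversity = ('"022640014X","book","hardcover","The Diversity Bargain: And Other Dilemmas of Race, '
             'Admissions, and Meritocracy at Elite Universities","Natasha K. Warikoo",'
             '"University Of Chicago Press"')

# No commas in the title: the field count is right, but quotes remain in the values.
hannah_fields = split_without_quoting(hannah)
print(len(hannah_fields), [strip_outer_quotes(f) for f in hannah_fields])

# Commas in the title: the line splits into more fields than the header has.
print(len(split_without_quoting(diversity)))
```

So quoting=3 is mainly useful when no field contains the separator; otherwise the quotes have to be repaired before parsing.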