Pandas.read_csv "unexpected end of data" Error

Tags: python, pandas

I'm trying to read a dataset using pd.read_csv() and am getting an error. Excel can open the file just fine.

reviews = pd.read_csv('br.csv') gives the error ParserError: Error tokenizing data. C error: EOF inside string starting at line 312074

reviews = pd.read_csv('br.csv', engine='python', encoding='utf-8') returns ParserError: unexpected end of data

What can I do to fix this?

Edit: This is the dataset - https://www.kaggle.com/gnanesh/goodreads-book-reviews

Ryan asked Aug 30 '18 21:08



2 Answers

For me adding this fixed it:

error_bad_lines=False

It just skips the last line. So instead of

reviews = pd.read_csv('br.csv', engine='python', encoding='utf-8')

use

reviews = pd.read_csv('br.csv', engine='python', encoding='utf-8', error_bad_lines=False)
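On newer pandas the same idea is spelled differently: error_bad_lines was deprecated in pandas 1.3 in favor of on_bad_lines and removed in 2.0. The following is a minimal sketch, assuming pandas 1.3 or later and using a small in-memory sample (hypothetical data standing in for br.csv) whose last record opens a quote that is never closed:

```python
import io

import pandas as pd

# Hypothetical stand-in for 'br.csv': the final record opens a quote
# that is never closed, so the parser hits end of data inside a string.
sample = (
    'title,review\n'
    'Dune,"Great book"\n'
    'Emma,"Slow start"\n'
    'Ulysses,"cut off mid-sentence\n'
)

# pandas >= 1.3: on_bad_lines='skip' takes the place of error_bad_lines=False.
reviews = pd.read_csv(io.StringIO(sample), engine='python', on_bad_lines='skip')

print(len(reviews))
print(reviews)
```

Skipping is silent, so it may be worth comparing the resulting row count against what you expect before relying on the data.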

Elise Mol answered Oct 12 '22 22:10


In my case, I don't want to skip lines, because my task requires counting the number of data records in the CSV file. The solution that works for me is csv.QUOTE_NONE from the csv library. I found it on a website I no longer remember, but it works.

To describe my case: previously I had the error EOF .... Then I tried the parameter engine='python', but that introduced another bug in the next step that uses the dataframe. Then I tried quoting=csv.QUOTE_NONE, and it's OK now. I hope this helps.

import csv
import pandas as pd
# QUOTE_NONE makes the parser treat quote characters as ordinary data
read_file = pd.read_csv(full_path, delimiter='~', encoding='utf-16 BE', header=0, quoting=csv.QUOTE_NONE)
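To see why this helps, here is a small sketch on made-up data (the sample string and column names are assumptions, not taken from the original file). A field that starts with a quote character and never closes it makes the default parser swallow the rest of the file, which is one way to end up with the EOF error. With csv.QUOTE_NONE the quote is kept as a literal character and every record is counted:

```python
import csv
import io

import pandas as pd

# Hypothetical '~'-delimited data: record 1 has a field that starts with
# a quote character that is never closed.
sample = 'id~comment\n1~"unfinished remark\n2~fine\n3~ok\n'

# Default quoting: the parser treats everything after the quote as one
# string and runs into the end of the data.
try:
    pd.read_csv(io.StringIO(sample), delimiter='~')
    default_failed = False
except pd.errors.ParserError as exc:
    default_failed = True
    print('default quoting failed:', exc)

# QUOTE_NONE: quote characters are ordinary data, so no record is lost.
records = pd.read_csv(io.StringIO(sample), delimiter='~', quoting=csv.QUOTE_NONE)

print(len(records))
print(records)
```

The trade-off is that quote characters stay in the values, and a delimiter inside a properly quoted field would now split that field, so this fits best when the file does not really use quoting.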
Linh Nguyen answered Oct 12 '22 23:10