 

Python Pandas Error tokenizing data

Tags: python, pandas, csv

I'm trying to use pandas to manipulate a .csv file but I get this error:

pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 fields in line 3, saw 12

I have tried to read the pandas docs, but found nothing.

My code is simple:

import pandas as pd

path = 'GOOG Key Ratios.csv'
#print(open(path).read())
data = pd.read_csv(path)

How can I resolve this? Should I use the csv module or another language?

The file is from Morningstar.

asked Aug 04 '13 01:08 by abuteau


People also ask

What is error Tokenizing data in Python?

pandas.errors.ParserError: Error tokenizing data is raised by the pandas parser when reading CSV files into DataFrames. It can be dealt with by fixing the errors or typos in the data file itself, or by specifying the appropriate line terminator.

What does error Tokenizing data mean?

The "Error tokenizing data" error may arise when you use a separator (e.g. a comma ',') as the delimiter and a row contains more separators than expected, so the offending row has more fields than the header defines. You need to either remove the additional field or remove the extra separator if it is there by mistake.
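As a minimal sketch of that situation, the snippet below reproduces the error from an in-memory string. The sample data is hypothetical (it is not the Morningstar file from the question): the header and first row have 2 fields, and the third line has 4.

```python
import io

import pandas as pd

# Hypothetical sample: 2 fields in the header, 4 fields on the third line.
raw = "a,b\n1,2\n3,4,5,6\n7,8\n"

try:
    pd.read_csv(io.StringIO(raw))
    message = None  # no error was raised
except pd.errors.ParserError as exc:
    # The message names the line number and the field counts involved.
    message = str(exc)

print(message)
```

The line number in the message counts lines in the file, including the header, which helps locate the row to inspect.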

What is parser error in pandas?

ParserError is the exception raised when an error is encountered while parsing file contents. It is a generic error raised when functions like read_csv or read_html parse the contents of a file.


1 Answer

You could also try:

data = pd.read_csv('file1.csv', on_bad_lines='skip') 

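Note that this causes the offending lines to be skipped, so their data is dropped from the result. The on_bad_lines parameter requires pandas 1.3 or later; older versions used error_bad_lines=False instead.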
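A short sketch of what skipping does, using hypothetical in-memory data rather than the file from the question. It also shows one possible alternative: if the bad lines are really a title or preamble above the true header (an assumption, since the question does not show the file's layout), skiprows can jump past them so that no data is lost.

```python
import io

import pandas as pd

# Hypothetical sample: the third line has too many fields.
raw = "a,b\n1,2\n3,4,5,6\n7,8\n"

# on_bad_lines="skip" (pandas 1.3+) silently drops the malformed row.
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(df)

# Hypothetical file with one title line above the real header.
raw_with_title = "Key ratios, sample\nyear,revenue,income\n2012,10,2\n2013,12,3\n"

# skiprows=1 ignores the title line, so the second line becomes the header.
ratios = pd.read_csv(io.StringIO(raw_with_title), skiprows=1)
print(ratios)
```

Skipping bad lines is convenient for a quick look, but it is worth checking how many rows were dropped before relying on the result.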

answered Sep 28 '22 16:09 by richie