
Pandas read_csv import results in error

Tags: python, pandas, csv

My csv is as follows (MQM Q.csv):

Date-Time,Value,Grade,Approval,Interpolation Code 
31/08/2012 12:15:00,,41,1,1 
31/08/2012 12:30:00,,41,1,1 
31/08/2012 12:45:00,,41,1,1 
31/08/2012 13:00:00,,41,1,1 
31/08/2012 13:15:00,,41,1,1 
31/08/2012 13:30:00,,41,1,1 
31/08/2012 13:45:00,,41,1,1 
31/08/2012 14:00:00,,41,1,1 
31/08/2012 14:15:00,,41,1,1

The first few lines have no "Value" entries but they start later on.

Here is my code:

import pandas as pd 
from StringIO import StringIO
Q = pd.read_csv(StringIO("""/cygdrive/c/temp/MQM Q.csv"""), header=0, usecols=["Date-Time", "Value"], parse_dates=True, dayfirst=True, index_col=0)

I get the following error:

Traceback (most recent call last):
  File "daily.py", line 4, in <module>
    Q = pd.read_csv(StringIO("""/cygdrive/c/temp/MQM Q.csv"""), header=0, usecols=["Date-Time", "Value"], parse_dates=True, dayfirst=True, index_col=0)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 443, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 228, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 533, in __init__
    self._make_engine(self.engine)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 670, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 1067, in __init__
    col_indices.append(self.names.index(u))
ValueError: 'Value' is not in list
asked Jun 18 '14 19:06 by Sid Kwakkel



1 Answer

This appears to be a bug in the CSV parser. Firstly, this works:

df = pd.read_csv('MQM Q.csv')

This also works:

df = pd.read_csv('MQM Q.csv', usecols=['Value'])

But if I ask for the Date-Time column by name, it fails with the same error message as yours.

I then noticed the file was UTF-8 encoded, so I converted it to ANSI using Notepad++ and it worked. I then tried UTF-8 without BOM and that also worked.

Finally I converted it back to UTF-8 (presumably now with a BOM) and it failed with the same error as before, so I don't think you are imagining this; it looks like a bug.
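As a rough illustration of why the BOM matters, here is a minimal sketch using only the standard library. The sample text is copied from the question's header and first row; the helper name header_names is hypothetical. It shows that when BOM-prefixed bytes are decoded as plain UTF-8, the BOM character stays attached to the first column name, so a lookup for 'Date-Time' no longer matches.

```python
import csv
import io

# Header and first data row as shown in the question.
csv_text = (
    "Date-Time,Value,Grade,Approval,Interpolation Code\n"
    "31/08/2012 12:15:00,,41,1,1\n"
)

# The "utf-8-sig" codec writes a byte order mark (EF BB BF) before the text.
raw = csv_text.encode("utf-8-sig")


def header_names(data, encoding):
    """Decode the bytes with the given codec and return the first CSV row."""
    reader = csv.reader(io.StringIO(data.decode(encoding)))
    return next(reader)


# Plain "utf-8" keeps the BOM as U+FEFF glued to the first column name.
with_bom = header_names(raw, "utf-8")
# "utf-8-sig" strips the BOM while decoding.
without_bom = header_names(raw, "utf-8-sig")

print(repr(with_bom[0]))
print(repr(without_bom[0]))
print("Date-Time" in with_bom, "Date-Time" in without_bom)
```

If the file cannot be re-saved without a BOM, passing encoding='utf-8-sig' to read_csv may be another option, though I have not tested that on pandas 0.14.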

I am using Python 3.3, pandas 0.14 and NumPy 1.8.1.

To get around this, select the columns by position instead of by name:

df = pd.read_csv('MQM Q.csv', usecols=[0,1], parse_dates=True, dayfirst=True, index_col=0)

This sets your index to the Date-Time column, which is correctly converted to a DatetimeIndex.

In [40]:

df.index
Out[40]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-08-31 12:15:00, ..., 2013-11-28 10:45:00]
Length: 43577, Freq: None, Timezone: None
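For anyone who wants to try the workaround without the original file, here is a small self-contained sketch. It assumes a recent pandas and reads the first rows from the question out of an in-memory buffer (BOM included) rather than from 'MQM Q.csv', so it only approximates the setup above.

```python
import io

import pandas as pd

# First rows of the question's file; Value is empty here, as in the sample.
csv_text = (
    "Date-Time,Value,Grade,Approval,Interpolation Code\n"
    "31/08/2012 12:15:00,,41,1,1\n"
    "31/08/2012 12:30:00,,41,1,1\n"
    "31/08/2012 12:45:00,,41,1,1\n"
)

# Encode with a BOM to mimic the problematic file, then read it from memory.
raw = csv_text.encode("utf-8-sig")

df = pd.read_csv(
    io.BytesIO(raw),
    usecols=[0, 1],    # positions, so the header text is never matched by name
    parse_dates=True,  # parse the index column as dates
    dayfirst=True,     # 31/08/2012 is day/month/year
    index_col=0,
)

print(type(df.index).__name__)
print(df.index[0])
print(list(df.columns))
```

Selecting by position sidesteps the header names entirely, which is why it is unaffected by whatever precedes 'Date-Time' in the first line.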
answered Sep 29 '22 00:09 by EdChum