Pandas read_csv import results in error

My csv is as follows (MQM Q.csv):

Date-Time,Value,Grade,Approval,Interpolation Code 
31/08/2012 12:15:00,,41,1,1 
31/08/2012 12:30:00,,41,1,1 
31/08/2012 12:45:00,,41,1,1 
31/08/2012 13:00:00,,41,1,1 
31/08/2012 13:15:00,,41,1,1 
31/08/2012 13:30:00,,41,1,1 
31/08/2012 13:45:00,,41,1,1 
31/08/2012 14:00:00,,41,1,1 
31/08/2012 14:15:00,,41,1,1

The first few rows have no "Value" entries; values only begin further down the file.

Here is my code:

import pandas as pd 
from StringIO import StringIO
Q = pd.read_csv(StringIO("""/cygdrive/c/temp/MQM Q.csv"""), header=0, usecols=["Date-Time", "Value"], parse_dates=True, dayfirst=True, index_col=0)

I get the following error:

Traceback (most recent call last):
  File "daily.py", line 4, in <module>
    Q = pd.read_csv(StringIO("""/cygdrive/c/temp/MQM Q.csv"""), header=0, usecols=["Date-Time", "Value"], parse_dates=True, dayfirst=True, index_col=0)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 443, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 228, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 533, in __init__
    self._make_engine(self.engine)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 670, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 1067, in __init__
    col_indices.append(self.names.index(u))
ValueError: 'Value' is not in list

asked Jun 18 '14 19:06

Sid Kwakkel

1 Answer

This appears to be a bug in the CSV parser. Firstly, this works:

df = pd.read_csv('MQM Q.csv')

This also works:

df = pd.read_csv('MQM Q.csv', usecols=['Value'])

But if I also ask for Date-Time, it fails with the same error message as yours.

I noticed the file was UTF-8 encoded, so I converted it to ANSI using Notepad++ and it worked. I then tried UTF-8 without BOM and that also worked.

I then converted it back to UTF-8 (presumably now with a BOM) and it failed with the same error as before, so I don't think you are imagining this; it looks like a bug.
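The BOM explanation can be checked without the original file. The following is a minimal sketch, assuming Python 3 and a reasonably recent pandas; the sample bytes are made up to mirror the question's header, and `utf-8-sig` is the standard Python codec that drops a leading BOM (it is not something taken from the original report). It shows that a BOM left in place becomes part of the first column name, which would explain why looking up 'Date-Time' by name fails while 'Value' still matches.

```python
import codecs
import io

import pandas as pd

# Made-up sample mirroring the question's layout (not the real file).
csv_text = (
    "Date-Time,Value,Grade,Approval,Interpolation Code\n"
    "31/08/2012 12:15:00,,41,1,1\n"
    "31/08/2012 12:30:00,,41,1,1\n"
)

# Simulate a file saved as "UTF-8 with BOM".
raw = codecs.BOM_UTF8 + csv_text.encode("utf-8")

# Decoded as plain UTF-8, the BOM survives as '\ufeff' glued to the first name.
names_plain = raw.decode("utf-8").splitlines()[0].split(",")
# Decoded as 'utf-8-sig', the leading BOM is stripped.
names_sig = raw.decode("utf-8-sig").splitlines()[0].split(",")

print(repr(names_plain[0]))
print(repr(names_sig[0]))

# Asking pandas to decode with 'utf-8-sig' lets the columns be selected by name.
df = pd.read_csv(
    io.BytesIO(raw),
    encoding="utf-8-sig",
    usecols=["Date-Time", "Value"],
)
print(list(df.columns))
```

If the name-based lookup fails on a real file, printing `repr()` of the first column name is a quick way to see whether a stray `\ufeff` is present.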

I am using Python 3.3, pandas 0.14 and NumPy 1.8.1.

To work around it, select the columns by position instead of by name:

df = pd.read_csv('MQM Q.csv', usecols=[0,1], parse_dates=True, dayfirst=True, index_col=0)

This sets the index to the Date-Time column, which is correctly converted to a DatetimeIndex:

In [40]:

df.index
Out[40]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-08-31 12:15:00, ..., 2013-11-28 10:45:00]
Length: 43577, Freq: None, Timezone: None
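The positional workaround can also be tried on in-memory data. This is a minimal sketch, assuming Python 3 and a reasonably recent pandas; the sample rows are made up to mirror the question's layout, and the `1.5` reading is invented purely so that the "Value" column is not entirely empty.

```python
import io

import pandas as pd

# Made-up sample mirroring the question's layout; the 1.5 reading is invented.
csv_text = (
    "Date-Time,Value,Grade,Approval,Interpolation Code\n"
    "31/08/2012 12:15:00,,41,1,1\n"
    "31/08/2012 12:30:00,,41,1,1\n"
    "31/08/2012 12:45:00,1.5,41,1,1\n"
)

# Select the first two columns by position, so the lookup does not depend
# on the exact header text, and parse the index as day-first dates.
df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=[0, 1],
    parse_dates=True,
    dayfirst=True,
    index_col=0,
)

print(type(df.index).__name__)
print(df["Value"].isna().sum())  # number of rows with no "Value" entry
```

Note that `io.StringIO` here wraps the CSV text itself. Wrapping a file path in `StringIO`, as the question's snippet does, makes `read_csv` parse the path string as if it were the file's contents, so the only column name it sees is the path and `'Value'` cannot be found. A path should be passed to `read_csv` directly, as in the calls above.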

answered Sep 29 '22 00:09

EdChum

Tags: python, pandas, csv