List index out of range with pandas read_csv

I'm trying to read large data sets (thousands of rows) in a Python script from CSV files that look like this:

.....
2015-11-03 20:16:28,000;63,62;
2015-11-03 20:16:29,000;63,75;
2015-11-03 20:16:30,000;63,86;
2015-11-03 20:16:31,000;64,25;

However, one of the files appears to contain extra empty rows made up of 196541465 blank spaces, and the code crashes when reading it with pandas' read_csv:

     File "/usr/lib/python2.7/dist-packages/pandas/core/frame.py", line 4221, in append
        elif isinstance(other, list) and not isinstance(other[0], DataFrame):
IndexError: list index out of range

I'm using the following command:

data = pd.read_csv(input_file, skiprows=[0], usecols=[0, 1, 2], delimiter=';', decimal=',',
                   names=['date', 'angle', 'Unnamed'], na_filter=False,
                   parse_dates=[0], date_parser=reformat_date,
                   error_bad_lines=False, skip_blank_lines=True)  # , nrows=8191)

The culprit row is the 8192nd. When I limit the number of rows (with nrows = 8191) it works just fine. I've tried many options from the docs, but none of them seem to help. Any ideas?
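One workaround that may be worth trying is to drop whitespace-only lines before pandas sees them. The following is only a minimal sketch, not a confirmed fix for the crash above: the helper name read_without_blank_rows and the in-memory sample are made up for illustration, and the lowercase column names are an assumption.

```python
import io

import pandas as pd


def read_without_blank_rows(lines, **read_csv_kwargs):
    """Hypothetical helper: drop empty or whitespace-only lines, then parse the rest."""
    kept = [line for line in lines if line.strip()]
    return pd.read_csv(io.StringIO("".join(kept)), **read_csv_kwargs)


# Made-up sample in the same layout as the question, with one whitespace-only row.
sample = [
    "2015-11-03 20:16:28,000;63,62;\n",
    "          \n",
    "2015-11-03 20:16:29,000;63,75;\n",
]

data = read_without_blank_rows(
    sample,
    delimiter=";",
    decimal=",",
    header=None,
    names=["date", "angle", "unnamed"],
)
print(data)
```

For a real file, the lines could come from an open file object instead of a list, for example read_without_blank_rows(open(input_file), delimiter=";", decimal=","). Note that this builds the filtered text in memory, which may matter for very large files.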

asked Jun 22 '16 09:06

Nero Ouali

2 Answers

I got this error because I was trying to read a CSV file that had too few headers for the number of columns (e.g. 10 columns, but only 8 headers). If you set index_col=False, pandas doesn't know what to do with the extra columns.
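A quick way to check whether a file has this kind of mismatch is to count the fields per line with the standard csv module before handing it to pandas. This is just a sketch: the function name field_counts and the sample text are made up for illustration.

```python
import csv
import io
from collections import Counter


def field_counts(text, delimiter=","):
    """Return the number of header fields and a Counter of field counts per data row."""
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(rows)
    # Empty lines come back as empty lists and are skipped here.
    counts = Counter(len(row) for row in rows if row)
    return len(header), counts


# Made-up sample: two headers, but three fields in every data row.
sample = "a,b\n1,2,3\n4,5,6\n"
n_header, counts = field_counts(sample)
print(n_header, dict(counts))
```

If the header count differs from the most common data-row count, the file has the header/column mismatch described above.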

answered Sep 24 '22 03:09

rogueleaderr

Edited according to Mitja's comment below.

I just had the same issue, and index_col = False didn't work. I had 19 columns and only 17 headers. I solved it by reading the headers and the data separately, then assigning the header names to the data.

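import pandas as pd

# Read a single row just to pick up the header names that do exist.
dfcolumns = pd.read_csv('file.csv',
                        nrows = 1)
# Re-read the data without a header, keeping only as many columns as there are names.
df = pd.read_csv('file.csv',
                  header = None,
                  skiprows = 1,
                  usecols = list(range(len(dfcolumns.columns))),
                  names = dfcolumns.columns)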
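To try the approach above without a file on disk, here is a minimal sketch on an in-memory sample. The sample text (two headers, three data columns) and the variable names are made up for illustration. Note that the columns beyond the named ones are dropped, not kept.

```python
import io

import pandas as pd

# Made-up sample: the header names two columns, the data rows have three.
text = "a,b\n1,2,3\n4,5,6\n"

# Step 1: read one row only to recover the header names.
header_only = pd.read_csv(io.StringIO(text), nrows=1)
names = list(header_only.columns)

# Step 2: skip the header line and keep just the first len(names) columns.
df = pd.read_csv(
    io.StringIO(text),
    header=None,
    skiprows=1,
    usecols=list(range(len(names))),
    names=names,
)
print(df)
```

If the unnamed trailing columns matter, one option is to extend names with placeholder labels and widen usecols accordingly, rather than truncating.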

answered Sep 22 '22 03:09

Marcus Högenå Bohman

Tags:

python

pandas

csv