Trailing delimiter confuses pandas read_csv

A CSV (comma-delimited) file whose lines have an extra trailing delimiter seems to confuse pandas.read_csv. (The data file is [1].)

pandas treats the extra delimiter as an extra column, so each data row has one more field than the header row. pandas.read_csv then takes the first column as row labels. The net effect is that columns and headers are no longer aligned: the first column becomes the row labels, the second column gets the first header's name, and so on.
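A minimal reproduction, as a sketch only: the data below is made up (it is not the FEC file), and the column names a, b, c and the variable names are placeholders chosen for illustration.

```python
import io

import pandas as pd

# Made-up sample: three header names, but every data row ends with an
# extra trailing comma, so each row has four fields.
raw = "a,b,c\n1,2,3,\n4,5,6,\n"

# Default call: the first field of each row is taken as the row label.
misaligned = pd.read_csv(io.StringIO(raw))

print(misaligned)
# The values 1 and 4 end up in the index, column "a" holds 2 and 5,
# column "b" holds 3 and 6, and column "c" is entirely missing values.
```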

It is quite annoying. Any idea how to tell pandas.read_csv to do the right thing? I couldn't find a way.

Great book, BTW.

[1]: 2012 FEC Election Database from chapter 9 of the book Python for Data Analysis

asked Dec 05 '12 09:12

edwardw

1 Answer

For everyone who is still finding this: Wes wrote a blog post about it. The problem is that if a row has one more value than the header, the first value is treated as the row's name.

This behaviour can be changed by setting index_col=False as an option to read_csv.
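A short sketch of that option, assuming a small made-up sample in place of the real file (the column names and variable names are placeholders):

```python
import io

import pandas as pd

# Made-up sample with a trailing comma on every data row.
raw = "a,b,c\n1,2,3,\n4,5,6,\n"

# index_col=False tells read_csv not to use the first column as the index,
# so the header names line up with the first three fields of each row.
fixed = pd.read_csv(io.StringIO(raw), index_col=False)

print(fixed)
# Columns a, b, c now hold 1/4, 2/5 and 3/6, with a default 0..n-1 index.
```

The empty field produced by the trailing delimiter is dropped, so no extra column appears in the result.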

answered Oct 14 '22 13:10

k-nut

Tags: python, pandas, csv, delimiter, numpy
