Old pre-0.17 pandas.read_csv behavior of `header=True` for inferring header row?

Tags: python, pandas, csv, header

How did old pre-0.17 versions of pandas read_csv() interpret passing a boolean header=True/False for inferring the header row?

I have CSV data with header:

col1;col2;col3
1.0;10.0;100.0
2.0;20.0;200.0
3.0;30.0;300.0

If I read it with `header=True`:

df = pandas.read_csv('test.csv', sep=';', header=True)

I get the following DataFrame:

   1.0  10.0  100.0
0    2    20    200
1    3    30    300

It means that pandas used the second row ("row 1") for column names (the names inferred are '1.0', '10.0' and '100.0').

Whereas if I read it with `header=False`:

df = pandas.read_csv('test.csv', sep=';', header=False)

I get the following:

   col1  col2  col3
0     1    10   100
1     2    20   200
2     3    30   300

This means that pandas used the first row ("row 0") as the header, in spite of the fact that I explicitly stated there is no header.

This behaviour is not intuitive to me. Can somebody explain what is happening?

asked Sep 23 '15 10:09

Roman

1 Answer

You are telling pandas which line is your header line. Passing False evaluates to 0, which is why it reads the first line as the header. Passing True evaluates to 1, so it reads the second line. If you pass None, pandas assumes there is no header row and auto-generates ordinal column names.

In [17]:
# Runs only on pandas < 0.17; later versions raise a TypeError for a boolean header.
import io
import pandas as pd
t="""col1;col2;col3
1.0;10.0;100.0
2.0;20.0;200.0
3.0;30.0;300.0"""
print('False:\n', pd.read_csv(io.StringIO(t), sep=';', header=False))
print('\nTrue:\n', pd.read_csv(io.StringIO(t), sep=';', header=True))
print('\nNone:\n', pd.read_csv(io.StringIO(t), sep=';', header=None))

False:
    col1  col2  col3
0     1    10   100
1     2    20   200
2     3    30   300

True:
    1.0  10.0  100.0
0    2    20    200
1    3    30    300

None:
       0     1      2
0  col1  col2   col3
1   1.0  10.0  100.0
2   2.0  20.0  200.0
3   3.0  30.0  300.0
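
The bool-to-int coercion described above can be checked without pandas. The following is only a sketch: `pick_header` is a hypothetical helper written for illustration, not pandas code, and it merely mimics treating `header` as a row index the way the old `read_csv` appeared to.

```python
# In Python, bool is a subclass of int, so False acts as 0 and True as 1.
rows = [
    "col1;col2;col3",
    "1.0;10.0;100.0",
    "2.0;20.0;200.0",
    "3.0;30.0;300.0",
]


def pick_header(rows, header, sep=";"):
    """Hypothetical helper: return (column_names, data_rows) for a header index."""
    if header is None:
        # No header row: generate ordinal names and keep every row as data.
        ncols = len(rows[0].split(sep))
        return list(range(ncols)), rows
    idx = int(header)  # False -> 0, True -> 1
    # Rows before the header row are discarded, as in the outputs above.
    return rows[idx].split(sep), rows[idx + 1:]


for value in (False, True, None):
    names, data = pick_header(rows, value)
    print(repr(value), "->", names, "with", len(data), "data rows")
```

Under this reading, `header=False` selects row 0 and `header=True` selects row 1, which matches the two DataFrames in the question.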

UPDATE

Since version 0.17.0, passing a boolean as header raises a TypeError.
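
On a current pandas install, the explicit spellings would look roughly like the sketch below. It assumes pandas 0.17 or later and reuses the sample data from the question. The variable names (`csv_text`, `df_named`, `df_plain`, `bool_rejected`) are made up for this example.

```python
import io

import pandas as pd

csv_text = """col1;col2;col3
1.0;10.0;100.0
2.0;20.0;200.0
3.0;30.0;300.0"""

# header=0: the first line supplies the column names.
df_named = pd.read_csv(io.StringIO(csv_text), sep=";", header=0)

# header=None: no header row; columns are numbered and line 0 is kept as data.
df_plain = pd.read_csv(io.StringIO(csv_text), sep=";", header=None)

# A boolean header is rejected instead of being coerced to 0 or 1.
try:
    pd.read_csv(io.StringIO(csv_text), sep=";", header=True)
    bool_rejected = False
except TypeError:
    bool_rejected = True

print(df_named.columns.tolist())
print(df_plain.shape)
print(bool_rejected)
```

Use `header=0` when the file has a header line and `header=None` when it does not. An integer such as `header=1` reproduces what `header=True` used to do.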

answered Oct 06 '22 01:10

EdChum
