
Python Pandas: I'm unable to set second row as column headers

Tags: python, pandas

I have an .xlsx file whose format is similar to the following. (Note that the first row is descriptive and not meant to be the column headers; the headers are on row 2.)

SHEET SUBJECT, Listings for 2010,,,,
Date, Name, Name_2, Abr, Number,         <--- I want this as column headers
12/01/2010, Company Name, Somecity, Chi, 36,
12/02/2010, Company Name, Someothercity, Nyc, 156,

When I run this_df = pd.read_excel('filename.xlsx'), the column headers are SHEET SUBJECT and Listings for 2010, followed by a series of unnamed columns. That is expected, but not what I want.

When I then run this_df.columns = this_df.iloc[1], expecting the column headers to be set from the row at index 1, I instead get the data values from the row at index 2.

What am I missing? Thanks.

Layne asked Dec 08 '22 14:12


1 Answer

Simply specify the zero-based row index of the header when you read the Excel file:

this_df = pd.read_excel('filename.xlsx', header=1)
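
To see why this_df.iloc[1] appeared to be off by one, here is a minimal sketch. It is an illustration under assumptions, not the asker's actual file: the sample rows are fed through pd.read_csv from an in-memory string (a stand-in so the snippet runs without an .xlsx file), and the names raw, default_df, fixed_df and repaired_df are made up for this example. The header argument is zero-based in read_excel as well, so the same reasoning should carry over.

```python
import io

import pandas as pd

# Stand-in for the spreadsheet (assumption: same rows as the question, as CSV text).
raw = (
    "SHEET SUBJECT,Listings for 2010,,,\n"
    "Date,Name,Name_2,Abr,Number\n"
    "12/01/2010,Company Name,Somecity,Chi,36\n"
    "12/02/2010,Company Name,Someothercity,Nyc,156\n"
)

# Default header=0: the descriptive row is consumed as the header,
# so the real header row ends up as the first data row (position 0, not 1).
default_df = pd.read_csv(io.StringIO(raw))
first_row = default_df.iloc[0].tolist()

# header=1: the second line of the input supplies the column names.
fixed_df = pd.read_csv(io.StringIO(raw), header=1)

# Repairing a frame that was already loaded with the default header:
# promote the row at position 0 to the header, then drop it from the data.
repaired_df = default_df.iloc[1:].reset_index(drop=True)
repaired_df.columns = first_row
```

Passing header=1 at read time is usually the better option, because the data columns are then parsed with their proper types; in the repaired frame the values stay as strings, since the header text was read in as data alongside them.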
rahlf23 answered Dec 10 '22 03:12