Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Remove header row in Excel using pandas

I have an Excel file with merge header that I read as dataframe using pandas. It looks like this after pd.read_excel():

Unnamed: 0     Pair    Unnamed: 1      Type      ...  Unnamed: 23
cabinet_name   group     pair          caller_id ...  result
value1         value1    value1        value1    ...  value1
value2         value2    value2        value2    ...  value2

So it's like I have two header rows. One is the row with Unnamed and the other is my desired header row.

This is my desired output:

cabinet_name   group     pair          caller_id ...  result
value1         value1    value1        value1    ...  value1
value2         value2    value2        value2    ...  value2

I am trying to remove the row with Unnamed:

df.drop(df.index[[0]])

and also using header=None in pd.read_excel('file.xlsx, header=None)'

But all of what I found did not return my expected output. I searched on how to delete rows with Unnamed but all I found was deleting columns.

I also tried

df.drop(df.head(0))

but it returned me:

KeyError: '[\'Unnamed: 0\' \'Pair'\ ... \'Unnamed: 23\']'

Any best way to do it?

like image 624
Ricky Aguilar Avatar asked Aug 30 '25 17:08

Ricky Aguilar


1 Answers

I believe you need skip first row by parameters skiprows=1 or header=1 and then remove all only NaNs columns:

df = (pd.read_excel('UF_AGT702-M.xlsx', skiprows=2, sheetname='Report')
        .dropna(how='all', axis=1))
like image 127
jezrael Avatar answered Sep 02 '25 14:09

jezrael