So I have a pandas dataFrame and I would like to turn two rows into multiple headers. so from
1 A | A | B | B
2 C | D | C | D
3 cat| dog|mouse|goose
to
A | B
C | D | C | D
1 cat| dog|mouse|goose
I found
df.columns = df.iloc[0]
to work for 1 row but i want multiple headers from the first and second row Thanks in advance!
To add multiple headers, we need to create a list of lists of headers and use it to rename the columns of the dataframe.
To promote the first row to column headers, select Home > Use First Row As Headers. To demote column headers to the first row, select Home, select the arrow next to Use First Row As Headers, and then select Use Headers as First Row.
A CSV file can have multiple rows after the header row. Each row corresponds to an artifact that will be created.
Create MultiIndex
by assign first and second row, last filter out first rows by positions by DataFrame.iloc
:
df.columns = [df.iloc[0].values, df.iloc[1].values]
df = df.iloc[2:].reset_index(drop=True)
print (df)
A B
C D C D
0 cat dog mouse goose
But better if create DataFrame from file use parameters header=[0,1]
in read_csv
:
df = pd.read_csv(file, header=[0,1])
Sample:
import pandas as pd
temp=u"""A,A,B,B
C,D,C,D
cat,dog,mouse,goose"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(pd.compat.StringIO(temp), header=[0,1])
print (df)
A B
C D C D
0 cat dog mouse goose
Using MultiIndex
create it , then assign it back
df.columns=pd.MultiIndex.from_arrays([df.iloc[0],df.iloc[1]])
yourdf=df.iloc[2:].reset_index(drop=True)
yourdf
Out[52]:
1 A B
2 C D C D
0 cat dog mouse goose
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With