I have an existing DataFrame that looks like this:
|   1  |   1  |   1  |   2  |   2  |   2  |   2
--------------------------------------------------------
|  abc |  def |  ghi |  jkl |  mno |  pqr |  stu
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
I've been trying this for some time, but with no success.
The repeated ones and twos are already a one-level MultiIndex. I know that if I add another level they will merge together, but I'm having a hard time turning that first row (abc, def, ...) into the second level of the MultiIndex.
Is there a simple way of doing this?
Desired output:
|          1         |             2
|  abc |  def |  ghi |  jkl |  mno |  pqr |  stu
--------------------------------------------------------
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
| 1.00 | 2.00 | 3.00 | 4.00 | 5.00 | 6.00 | 7.00
Any help would be much appreciated! Thanks.
The solution proposed by Jezrael requires some corrections:
1. df.columns and df.iloc[0] should be passed together, as a single list, as the first argument of from_arrays, not as two separate arguments.
2. The source of the second level of the MultiIndex (df.iloc[0]) should be supplemented with .values. Otherwise this MultiIndex level inherits the name 0, which is the index label of row 0.
3. The resulting MultiIndex should be assigned to df.columns, not to the whole df.
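The point about .values can be checked with a small sketch. The three-column frame below is a made-up stand-in for the one in the question (an assumption, not the asker's data); it only compares the level names that from_arrays produces with and without .values.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: duplicated top-level labels,
# with the future second level stored as the first data row.
df = pd.DataFrame(
    [["abc", "def", "ghi"], [1.0, 2.0, 3.0]],
    columns=[1, 1, 2],
)

# Passing the Series itself: from_arrays picks up the Series name (the row label 0).
with_series = pd.MultiIndex.from_arrays([df.columns, df.iloc[0]])

# Passing the underlying array: there is no name to inherit.
with_values = pd.MultiIndex.from_arrays([df.columns, df.iloc[0].values])

print(list(with_series.names))
print(list(with_values.names))
```

If a level name is actually wanted, the names argument of from_arrays can set it explicitly instead.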
So the whole solution should be:
df.columns = pd.MultiIndex.from_arrays([df.columns, df.iloc[0].values])
df = df.iloc[1:]
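For anyone who wants to try this end to end, here is a minimal sketch that rebuilds a frame shaped like the one in the question and applies the two lines above. The sample data is reconstructed from the tables in the question, and the reset_index and astype calls are optional extras that were not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about its exact shape).
top = [1, 1, 1, 2, 2, 2, 2]
labels = ["abc", "def", "ghi", "jkl", "mno", "pqr", "stu"]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

# The first row holds the labels that should become the second column level.
df = pd.DataFrame([labels] + [values] * 5, columns=top)

# Build the two-level column index from the old columns plus the first row.
df.columns = pd.MultiIndex.from_arrays([df.columns, df.iloc[0].values])

# Drop the row that was promoted to a column level.
df = df.iloc[1:]

# Optional extras: renumber the rows and restore a numeric dtype, since the
# label row forced the columns to object dtype.
df = df.reset_index(drop=True).astype(float)

print(df)
print(df[1].columns.tolist())  # second-level labels under top-level key 1
```

Selecting df[1] or df[2] then returns the sub-frame under that top-level label.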
I think you need MultiIndex.from_arrays and then filter out the first row with DataFrame.iloc indexing:
df = pd.MultiIndex.from_arrays(df.columns, df.iloc[0])
df = df.iloc[1:]