How do I convert an existing dataframe with single-level columns to have hierarchical index columns (MultiIndex)?
Example dataframe:
In [1]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
df = DataFrame(np.arange(6).reshape((2,3)),
index=['A','B'],
columns=['one','two','three'])
df
Out [1]:
one two three
A 0 1 2
B 3 4 5
I'd have thought that reindex() would work, but I get NaN's:
In [2]:
df.reindex(columns=[['odd','even','odd'],df.columns])
Out [2]:
odd even odd
one two three
A NaN NaN NaN
B NaN NaN NaN
Same if I use DataFrame():
In [3]:
DataFrame(df,columns=[['odd','even','odd'],df.columns])
Out [3]:
odd even odd
one two three
A NaN NaN NaN
B NaN NaN NaN
This last approach actually does work if I specify df.values:
In [4]:
DataFrame(df.values,index=df.index,columns=[['odd','even','odd'],df.columns])
Out [4]:
odd even odd
one two three
A 0 1 2
B 3 4 5
What's the proper way to do this? Why does reindex() give NaN's?
A multi-level index can be converted back into ordinary columns with the reset_index() method. DataFrame.reset_index() resets the index to the default integer index and turns the old index levels into columns of the DataFrame.
We can set a specific column, or several columns, as the index of a pandas DataFrame by passing a column label or a list of column labels to DataFrame.set_index(). Passing a list of labels produces a MultiIndex.
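As a rough illustration of the two methods just described, here is a minimal round trip. The column names `letter`, `number` and `value` are made up for the example.

```python
import pandas as pd

# Hypothetical data; the column names are for illustration only.
flat = pd.DataFrame({'letter': ['A', 'A', 'B'],
                     'number': [1, 2, 1],
                     'value': [10, 20, 30]})

# A list of column labels gives a two-level row MultiIndex.
indexed = flat.set_index(['letter', 'number'])

# reset_index() moves the index levels back into ordinary columns
# and restores the default integer index.
restored = indexed.reset_index()
```

Note that both calls return new DataFrames; `flat` itself is left unchanged.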
A MultiIndex (also known as a hierarchical index) DataFrame allows you to have multiple columns acting as a row identifier and multiple rows acting as a header identifier. With MultiIndex, you can do some sophisticated data analysis, especially for working with higher dimensional data.
You were close: just set the columns directly to a new, equal-sized index-like object (a list of lists will be converted to a MultiIndex).
In [8]: df
Out[8]:
one two three
A 0 1 2
B 3 4 5
In [10]: df.columns = [['odd','even','odd'],df.columns]
In [11]: df
Out[11]:
odd even odd
one two three
A 0 1 2
B 3 4 5
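If you would rather be explicit about what the columns become, the same assignment can be written with pd.MultiIndex.from_arrays. This is only a sketch of an equivalent spelling; the level names `parity` and `name` are my own additions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape((2, 3)),
                  index=['A', 'B'],
                  columns=['one', 'two', 'three'])

# Build the two-level column index explicitly.
# The level names are hypothetical and optional.
df.columns = pd.MultiIndex.from_arrays(
    [['odd', 'even', 'odd'], df.columns],
    names=['parity', 'name'],
)

# Selecting on the outer level returns the matching inner columns.
odd_part = df['odd']
```

Naming the levels is not required, but it makes later selections by level easier to read.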
reindex() reorders or filters the existing index. You get all NaNs because you are asking it to find the existing columns that match the new index; none match, so every column in the result is filled with NaN.
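The label-matching behaviour is easiest to see with plain single-level labels. In this small sketch the label `'four'` is a made-up column that does not exist in the frame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape((2, 3)),
                  index=['A', 'B'],
                  columns=['one', 'two', 'three'])

# Labels that already exist are kept, in the requested order;
# 'two' is dropped because it was not requested;
# 'four' matches nothing, so that column is filled with NaN.
result = df.reindex(columns=['three', 'one', 'four'])
```

The MultiIndex case in the question is the same thing taken to the extreme: none of the new (tuple) labels match the old string labels, so every column ends up as NaN.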