Is there a shorter way of dropping a column MultiIndex level (in my case, basic_amt) than transposing it twice?
In [704]: test
Out[704]:
          basic_amt
Faculty         NSW QLD VIC All
All               1   1   2   4
Full Time         0   1   0   1
Part Time         1   0   2   3

In [705]: test.reset_index(level=0, drop=True)
Out[705]:
        basic_amt
Faculty       NSW QLD VIC All
0               1   1   2   4
1               0   1   0   1
2               1   0   2   3

In [711]: test.transpose().reset_index(level=0, drop=True).transpose()
Out[711]:
Faculty    NSW  QLD  VIC  All
All          1    1    2    4
Full Time    0    1    0    1
Part Time    1    0    2    3
To reset the index in pandas, chain .reset_index() onto the DataFrame object. The existing row index is moved into the DataFrame as a regular column (unless drop=True is passed), and a default integer index takes its place.
A multi-level row index can be converted into columns with the same method: DataFrame.reset_index() resets the index to the default and turns each index level into a column of the DataFrame.
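As a minimal sketch of that behaviour, the frame below is hypothetical (the names state, mode and n are made up for illustration and do not come from the question):

```python
import pandas as pd

# Hypothetical data: a two-level row index (state, mode).
idx = pd.MultiIndex.from_tuples(
    [("NSW", "Full Time"), ("NSW", "Part Time"), ("QLD", "Full Time")],
    names=["state", "mode"],
)
df = pd.DataFrame({"n": [1, 2, 3]}, index=idx)

# reset_index() moves every index level into an ordinary column
# and replaces the index with a default RangeIndex.
flat = df.reset_index()
print(flat.columns.tolist())

# Passing level= moves only the chosen level; the other stays in the index.
partial = df.reset_index(level="state")
print(partial.index.name)
```

Note that reset_index() only touches the row index, so on its own it cannot remove a level from the columns.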
To drop multiple levels from a multi-level column index, call columns.droplevel() repeatedly. MultiIndex.from_tuples() can be used to create such a column index from a list of tuples.
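A short sketch of dropping more than one column level, assuming a hypothetical three-level column index (the level names measure and year are invented for the example). Besides repeated calls, MultiIndex.droplevel also accepts a list of levels:

```python
import pandas as pd

# Hypothetical three-level column index built from tuples.
cols = pd.MultiIndex.from_tuples(
    [("basic_amt", "2016", "NSW"), ("basic_amt", "2016", "QLD")],
    names=["measure", "year", "Faculty"],
)
df = pd.DataFrame([[1, 1], [0, 1]], columns=cols)

# Option 1: drop one level at a time (level 0 twice).
step = df.columns.droplevel(0).droplevel(0)

# Option 2: pass a list of level names (or positions) in a single call.
once = df.columns.droplevel(["measure", "year"])

df.columns = once
print(df.columns.tolist())
```

Both options should leave a flat Index that keeps only the Faculty level.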
DataFrame.reset_index() resets the row index of a DataFrame. By default, it adds the current row index as a new column (called 'index' when the index has no name) and creates a new row index as a range of numbers starting at 0.
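To illustrate the default and the drop=True variant, here is a small sketch on a cut-down, single-column version of the frame from the question:

```python
import pandas as pd

df = pd.DataFrame({"NSW": [1, 0, 1]}, index=["All", "Full Time", "Part Time"])

# Default: the old row labels become a column named 'index'.
kept = df.reset_index()

# drop=True: the old row labels are discarded instead.
dropped = df.reset_index(drop=True)

print(kept.columns.tolist())
print(dropped.index.tolist())
```

This is why Out[705] in the question lost its row labels: reset_index(level=0, drop=True) acted on the rows, not on the column MultiIndex.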
Another solution is to use MultiIndex.droplevel with rename_axis (new in pandas 0.18.0):
import pandas as pd

cols = pd.MultiIndex.from_arrays([['basic_amt']*4, ['NSW','QLD','VIC','All']], names = [None, 'Faculty'])
idx = pd.Index(['All', 'Full Time', 'Part Time'])
df = pd.DataFrame([(1,1,2,4), (0,1,0,1), (1,0,2,3)], index = idx, columns=cols)
print (df)
          basic_amt
Faculty         NSW QLD VIC All
All               1   1   2   4
Full Time         0   1   0   1
Part Time         1   0   2   3

df.columns = df.columns.droplevel(0)
#pandas 0.18.0 and higher
df = df.rename_axis(None, axis=1)
#pandas below 0.18.0
#df.columns.name = None
print (df)
           NSW  QLD  VIC  All
All          1    1    2    4
Full Time    0    1    0    1
Part Time    1    0    2    3

print (df.columns)
Index(['NSW', 'QLD', 'VIC', 'All'], dtype='object')
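On newer pandas the same result can be had in one chained expression. The sketch below assumes pandas 0.24 or later, where DataFrame.droplevel is available; it is an alternative spelling, not part of the original answer:

```python
import pandas as pd

cols = pd.MultiIndex.from_arrays(
    [["basic_amt"] * 4, ["NSW", "QLD", "VIC", "All"]],
    names=[None, "Faculty"],
)
idx = pd.Index(["All", "Full Time", "Part Time"])
df = pd.DataFrame([(1, 1, 2, 4), (0, 1, 0, 1), (1, 0, 2, 3)], index=idx, columns=cols)

# Assumes pandas 0.24+: DataFrame.droplevel returns a new frame
# with the requested column level removed; rename_axis then clears
# the leftover 'Faculty' columns name.
result = df.droplevel(0, axis=1).rename_axis(None, axis=1)
print(result.columns.tolist())
```

Because both calls return new objects, df itself is left unchanged.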
If you need both column names, use a list comprehension:
df.columns = ['_'.join(col) for col in df.columns]
print (df)
           basic_amt_NSW  basic_amt_QLD  basic_amt_VIC  basic_amt_All
All                    1              1              2              4
Full Time              0              1              0              1
Part Time              1              0              2              3

print (df.columns)
Index(['basic_amt_NSW', 'basic_amt_QLD', 'basic_amt_VIC', 'basic_amt_All'], dtype='object')
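One caveat with the join approach: str.join only accepts strings, so it raises a TypeError if any level holds non-string labels. A hedged variant, using hypothetical integer year labels that are not in the original data:

```python
import pandas as pd

# Hypothetical columns where the second level holds integers.
cols = pd.MultiIndex.from_tuples([("basic_amt", 2015), ("basic_amt", 2016)])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Convert each label to str before joining the levels.
df.columns = ["_".join(map(str, col)) for col in df.columns]
print(df.columns.tolist())
```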