I have a DataFrame with a pandas MultiIndex:
In [1]: import pandas as pd

In [2]: multi_index = pd.MultiIndex.from_product([['CAN','USA'],['total']],names=['country','sex'])

In [3]: df = pd.DataFrame({'pop':[35,318]},index=multi_index)

In [4]: df
Out[4]:
               pop
country sex
CAN     total   35
USA     total  318
Then I remove some rows from that DataFrame:
In [5]: df = df.query('pop > 100')

In [6]: df
Out[6]:
               pop
country sex
USA     total  318
But when I inspect the MultiIndex, it still has both countries in its levels:
In [7]: df.index.levels[0]
Out[7]: Index([u'CAN', u'USA'], dtype='object')
I can fix this myself in a rather strange way:
In [8]: idx_names = df.index.names

In [9]: df = df.reset_index(drop=False)

In [10]: df = df.set_index(idx_names)

In [11]: df
Out[11]:
               pop
country sex
USA     total  318

In [12]: df.index.levels[0]
Out[12]: Index([u'USA'], dtype='object')
But this seems rather messy. Is there a better way I'm missing?
For reference, set_index() sets the DataFrame index (row labels) using one or more existing columns or arrays of the correct length; the new index can replace the existing index or expand on it. reset_index() does the reverse: it resets the index to the default integer index, either moving the old index labels into columns or, with drop=True, discarding them.
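As a minimal sketch of why the reset_index()/set_index() round trip drops the stale level values: the index is moved into ordinary columns and then rebuilt from the values that are actually present. The helper name rebuild_index is hypothetical, and a boolean mask is used in place of query() purely for illustration.

```python
import pandas as pd


def rebuild_index(frame):
    """Round-trip the index through columns so its levels are recomputed.

    Hypothetical helper, not part of pandas.
    """
    names = list(frame.index.names)
    # reset_index() turns the index levels into regular columns;
    # set_index() then builds a fresh MultiIndex from the remaining values.
    return frame.reset_index().set_index(names)


idx = pd.MultiIndex.from_product(
    [['CAN', 'USA'], ['total']], names=['country', 'sex']
)
df = pd.DataFrame({'pop': [35, 318]}, index=idx)

filtered = df[df['pop'] > 100]      # row for CAN is gone, level value is not
rebuilt = rebuild_index(filtered)   # levels recomputed from remaining rows

print(list(filtered.index.levels[0]))  # ['CAN', 'USA']
print(list(rebuilt.index.levels[0]))   # ['USA']
```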
From pandas 0.20.0 onward, use MultiIndex.remove_unused_levels:
print (df.index)
MultiIndex(levels=[['CAN', 'USA'], ['total']],
           labels=[[1], [0]],
           names=['country', 'sex'])

df.index = df.index.remove_unused_levels()

print (df.index)
MultiIndex(levels=[['USA'], ['total']],
           labels=[[0], [0]],
           names=['country', 'sex'])
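For a self-contained version of the same fix, here is a small sketch that rebuilds the example frame, filters it, and compares the levels before and after remove_unused_levels(). It also shows get_level_values(), which reports only the values present in the rows and can be enough if you just need to read them. The variable names (stale, present, pruned) are made up for this example, and a boolean mask stands in for query().

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [['CAN', 'USA'], ['total']], names=['country', 'sex']
)
df = pd.DataFrame({'pop': [35, 318]}, index=idx)
df = df[df['pop'] > 100]  # keeps only the USA row

# levels still lists every value the index was built with
stale = list(df.index.levels[0])

# get_level_values() reflects only the rows that remain
present = list(df.index.get_level_values('country').unique())

# remove_unused_levels() returns a new MultiIndex, so assign it back
df.index = df.index.remove_unused_levels()
pruned = list(df.index.levels[0])

print(stale)    # ['CAN', 'USA']
print(present)  # ['USA']
print(pruned)   # ['USA']
```

Note that remove_unused_levels() does not modify the index in place, which is why the result is assigned back to df.index.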