Say you have this MultiIndex-ed DataFrame:
    df = pd.DataFrame({'co': ['DE', 'DE', 'FR', 'FR'],
                       'tp': ['Lake', 'Forest', 'Lake', 'Forest'],
                       'area': [10, 20, 30, 40],
                       'count': [7, 5, 2, 3]})
    df = df.set_index(['co', 'tp'])

Which looks like this:

               area  count
    co tp
    DE Lake      10      7
       Forest    20      5
    FR Lake      30      2
       Forest    40      3

I would like to retrieve the unique values per index level. This can be accomplished using
    df.index.levels[0]  # returns ['DE', 'FR']
    df.index.levels[1]  # returns ['Forest', 'Lake'] (levels are stored sorted)

What I would really like to do is retrieve these lists by addressing the levels by their name, i.e. 'co' and 'tp'. The two shortest ways I could find look like this:

    list(set(df.index.get_level_values('co')))  # returns ['DE', 'FR'] (order not guaranteed)
    df.index.levels[df.index.names.index('co')]  # returns ['DE', 'FR']

But neither of them is very elegant. Is there a shorter way?
You can get the unique values of a single DataFrame column with Series.unique(). To get the unique values across multiple columns, apply pandas.unique() to the flattened values of those columns.
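As a rough illustration of the column case described above, here is a minimal sketch. The frame and its column names `a` and `b` are made up for the example and are not part of the question.

```python
import pandas as pd

# Hypothetical example data, not taken from the question.
frame = pd.DataFrame({'a': ['x', 'y', 'x'],
                      'b': ['y', 'z', 'z']})

# Unique values of a single column, in order of first appearance.
single = frame['a'].unique()

# Unique values across several columns: flatten the values row by row,
# then let pandas.unique() drop the duplicates.
combined = pd.unique(frame[['a', 'b']].to_numpy().ravel())

print(list(single))
print(list(combined))
```

Note that this works on columns, not on index levels, so it only applies to the question if the index is first turned back into columns with reset_index().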
Pandas 0.23.0 finally introduced a much cleaner solution to this problem: the level argument to Index.unique():
    In [3]: df.index.unique(level='co')
    Out[3]: Index(['DE', 'FR'], dtype='object', name='co')

This is now the recommended solution. It is far more efficient because it avoids creating a complete representation of the level values in memory and then re-scanning it.
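For completeness, here is a self-contained sketch that rebuilds the DataFrame from the question and applies the `level` argument to every named level. The `unique_by_level` dictionary is only an illustrative convenience, not something pandas provides, and the code assumes pandas 0.23.0 or later.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({'co': ['DE', 'DE', 'FR', 'FR'],
                   'tp': ['Lake', 'Forest', 'Lake', 'Forest'],
                   'area': [10, 20, 30, 40],
                   'count': [7, 5, 2, 3]}).set_index(['co', 'tp'])

# Unique values of one level, addressed by name.
co_values = df.index.unique(level='co')

# Illustrative helper structure: unique values of every named level.
unique_by_level = {name: df.index.unique(level=name).tolist()
                   for name in df.index.names}

print(co_values)
print(unique_by_level)
```

Unlike df.index.levels, which stores each level sorted, the values here come back in order of first appearance in the index, so 'tp' yields 'Lake' before 'Forest'.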
I guess you want the unique values of a certain level of a MultiIndex, addressed by level name. I usually do the following, which is a bit long.

    In [11]: df.index.get_level_values('co').unique()
    Out[11]: array(['DE', 'FR'], dtype=object)