Pandas multiindex: get level values without duplicates

I'm sure this is pretty trivial, but I'm fairly new to Python/pandas.

I want to get a certain level of my MultiIndex (the names of my measurements) as a list, so I can use it in a for loop later to name and save my plots. I'm pretty confident in getting the data I need from my DataFrame, but I can't figure out how to get a single level out of my index.

While writing the question I more or less figured out an answer, but it still seems clunky. There has to be a more direct command for this. This is my code:

a = df.index.get_level_values('File')
a = a.drop_duplicates()
a = a.values
Asked by ndrs, Dec 13 '22 16:12


1 Answer

index.levels

You can access unique elements of each level of your MultiIndex directly:

import pandas as pd

df = pd.DataFrame([['A', 'W', 1], ['B', 'X', 2], ['C', 'Y', 3],
                   ['D', 'X', 4], ['E', 'Y', 5]])
df = df.set_index([0, 1])

a = df.index.levels[1]

print(a)
Index(['W', 'X', 'Y'], dtype='object', name=1)

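To understand the information available, see how the MultiIndex object is stored internally. The output below comes from an older pandas release; since pandas 0.24 the labels attribute is called codes, and recent versions print a MultiIndex as a list of tuples instead: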

print(df.index)

MultiIndex(levels=[['A', 'B', 'C', 'D', 'E'], ['W', 'X', 'Y']],
           labels=[[0, 1, 2, 3, 4], [0, 1, 2, 1, 2]],
           names=[0, 1])

However, the methods below are more intuitive and better documented.

One point worth noting is that you don't have to explicitly extract the NumPy array via the values attribute: you can iterate over Index objects directly. In addition, method chaining is possible and encouraged with pandas.
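
As a rough sketch of the looping use case from the question, the Index returned by a chained call can go straight into a for loop. The level names 'File' and 'Step', the sample data, and the .png file-name pattern are all made up for illustration.

```python
import pandas as pd

# Hypothetical data: two measurement files with two rows each.
df = pd.DataFrame(
    {'File': ['run_b', 'run_b', 'run_a', 'run_a'],
     'Step': [0, 1, 0, 1],
     'Value': [1.0, 2.0, 3.0, 4.0]}
).set_index(['File', 'Step'])

plot_names = []
# Chain the calls and iterate over the resulting Index directly; no .values needed.
for name in df.index.get_level_values('File').unique():
    subset = df.xs(name, level='File')  # rows belonging to this file
    plot_names.append(f'{name}_{len(subset)}rows.png')

print(plot_names)  # ['run_b_2rows.png', 'run_a_2rows.png']
```

In a real script the body of the loop would plot subset and save the figure under the generated name.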

drop_duplicates / unique

Returns an Index object, with order preserved.

a = df.index.get_level_values(1).drop_duplicates()
# equivalently, df.index.get_level_values(1).unique()

print(a)
Index(['W', 'X', 'Y'], dtype='object', name=1)
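
If the goal is a single call, the index's own unique method also accepts a level argument (I believe this arrived around pandas 0.23, so treat the version as an assumption). The sketch below uses made-up data with named levels and also shows how the ordering differs from levels.

```python
import pandas as pd

# Hypothetical index; 'File' and 'Step' are illustrative level names.
idx = pd.MultiIndex.from_arrays(
    [['run_b', 'run_b', 'run_a'], [0, 1, 0]],
    names=['File', 'Step'])

# One call: unique values of a level, in order of first appearance.
by_appearance = idx.unique(level='File').tolist()

# levels holds the same values here, but sorted rather than in row order.
sorted_levels = idx.levels[0].tolist()

print(by_appearance)  # ['run_b', 'run_a']
print(sorted_levels)  # ['run_a', 'run_b']
```

The tolist() call is only there for a plain Python list; the Index itself can be iterated as-is.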

set

Returns a set. Useful for O(1) membership lookup, but the result is unordered, so the printed order can vary.

a = set(df.index.get_level_values(1))

print(a)
{'X', 'Y', 'W'}
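
A small sketch of how the set might be used, reusing the example data: membership tests, plus sorted() when a reproducible order is needed.

```python
import pandas as pd

df = pd.DataFrame([['A', 'W', 1], ['B', 'X', 2], ['C', 'Y', 3],
                   ['D', 'X', 4], ['E', 'Y', 5]])
df = df.set_index([0, 1])

seen = set(df.index.get_level_values(1))

# Constant-time membership checks.
has_x = 'X' in seen
has_z = 'Z' in seen

# sorted() turns the unordered set into a list with a stable order.
ordered = sorted(seen)

print(has_x, has_z)  # True False
print(ordered)       # ['W', 'X', 'Y']
```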
Answered by jpp, Jan 12 '23 02:01