I'm sure this is pretty trivial, but I'm pretty new to Python/pandas.
I want to get one level of my MultiIndex (the names of my measurements) as a list, so I can use it in a for loop later to name and save my plots. I'm fairly confident about getting the data I need from my DataFrame, but I can't figure out how to get a particular level out of the index.
While writing this question I more or less figured out an answer, but it still seems clunky. There has to be a more direct command for this. Here is my code:
a = df.index.get_level_values('File')
a = a.drop_duplicates()
a = a.values
You can access the unique elements of each level of your MultiIndex directly:
import pandas as pd

# Small example frame; columns 0 and 1 become the two index levels
df = pd.DataFrame([['A', 'W', 1], ['B', 'X', 2], ['C', 'Y', 3],
                   ['D', 'X', 4], ['E', 'Y', 5]])
df = df.set_index([0, 1])
a = df.index.levels[1]
print(a)
Index(['W', 'X', 'Y'], dtype='object', name=1)
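To understand the information available, see how the MultiIndex object is stored internally (the output below is from an older pandas version; newer versions print the index as a list of tuples and expose the labels attribute as codes):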
print(df.index)
MultiIndex(levels=[['A', 'B', 'C', 'D', 'E'], ['W', 'X', 'Y']],
labels=[[0, 1, 2, 3, 4], [0, 1, 2, 1, 2]],
names=[0, 1])
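One caveat with reading levels directly: the levels are stored separately from the rows, so after filtering they can still list values that no longer occur in the index. The sketch below illustrates this on the example frame from above; the variable names (subset, stale, present, trimmed) are only illustrative.

```python
import pandas as pd

# Same example frame as above.
df = pd.DataFrame([['A', 'W', 1], ['B', 'X', 2], ['C', 'Y', 3],
                   ['D', 'X', 4], ['E', 'Y', 5]])
df = df.set_index([0, 1])

# Keep only the rows whose second index level is 'X'.
subset = df[df.index.get_level_values(1) == 'X']

# .levels still lists every value the original index knew about.
stale = list(subset.index.levels[1])

# get_level_values reflects only the rows that remain.
present = list(subset.index.get_level_values(1).unique())

# remove_unused_levels() rebuilds the levels from the remaining rows.
trimmed = list(subset.index.remove_unused_levels().levels[1])

print(stale, present, trimmed)
```

If the frame has been sliced or filtered, get_level_values or remove_unused_levels is probably the safer way to see which values are actually present.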
However, the methods below are more intuitive and better documented.
Note that you don't have to explicitly extract the NumPy array via the values attribute: you can iterate over Index objects directly. In addition, method chaining is possible and encouraged with pandas.
Chaining get_level_values with drop_duplicates returns an Index object, with the order of first appearance preserved.
a = df.index.get_level_values(1).drop_duplicates()
# equivalently, df.index.get_level_values(1).unique()
print(a)
Index(['W', 'X', 'Y'], dtype='object', name=1)
Wrapping the level values in set returns a set. Useful for O(1) lookup, but the result is unordered.
a = set(df.index.get_level_values(1))
print(a)
{'X', 'Y', 'W'}
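To tie this back to the use case in the question, here is a minimal sketch of looping over the unique values of one level to build file names for plots. The level name 'File' comes from the question; the other level ('Sample'), the data, and the '.png' naming scheme are assumptions made up for illustration, and the actual plotting call is left out.

```python
import pandas as pd

# Hypothetical data: 'File' is the level name from the question;
# the file names, 'Sample' level, and values are made up.
df = pd.DataFrame({
    'File': ['run_b', 'run_b', 'run_a', 'run_a'],
    'Sample': [1, 2, 1, 2],
    'Value': [0.1, 0.2, 0.3, 0.4],
}).set_index(['File', 'Sample'])

plot_names = []
# Iterate over the Index directly; no .values needed.
for name in df.index.get_level_values('File').unique():
    # Select all rows for this measurement; a plot call would go here.
    group = df.xs(name, level='File')
    plot_names.append(f'{name}.png')

print(plot_names)
```

Because unique keeps the order of first appearance, the file names come out in the same order as the measurements appear in the index.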