How to iterate over MultiIndex levels in Pandas?

I often work with a MultiIndex and would like to iterate over the groups of rows whose higher-level index values are equal. It basically looks like this:

from random import choice
import pandas as pd
N = 100
df = pd.DataFrame([choice([1, 2, 3]) for _ in range(N)],
                  columns=["A"],
                  index=pd.MultiIndex.from_tuples([(choice("ab"), choice("cd"), choice("de")) 
                                                   for _ in range(N)]))

for idx in zip(df.index.get_level_values(0), df.index.get_level_values(1)):
    df_select = df.loc[idx]

Is there a way to do the for loop iteration more neatly?

Gerenuk asked Dec 07 '15 17:12


2 Answers

Use groupby on the first two index levels. Each df_select keeps the full MultiIndex, including the first two level values, but is otherwise the same as the selection in your example.

for idx, df_select in df.groupby(level=[0, 1]):
    ...
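
As a rough sketch of how the loop above behaves, the example below uses a small hand-written frame (the data and the names groups and trimmed are illustrative assumptions, not part of the question). It shows that groupby yields one (key, sub-frame) pair per distinct combination of the first two levels, and that droplevel can strip those levels afterwards if only the last one is wanted.

```python
import pandas as pd

# Small deterministic frame (illustrative data, not from the question).
index = pd.MultiIndex.from_tuples(
    [("a", "c", "d"), ("a", "c", "e"), ("a", "d", "d"), ("b", "c", "e")]
)
df = pd.DataFrame({"A": [1, 2, 3, 1]}, index=index)

# One iteration per distinct (level 0, level 1) pair.
groups = {}
for idx, df_select in df.groupby(level=[0, 1]):
    groups[idx] = df_select

# Optionally drop the two grouped levels so only the last level remains.
trimmed = {idx: g.droplevel([0, 1]) for idx, g in groups.items()}
```

Each sub-frame in groups still carries all three index levels; trimmed holds the same rows indexed by the last level only.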
Mzzzzzz answered Sep 21 '22 22:09


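As an alternative to groupby, you can map a lambda function over the index. This has the advantage of not having to specify the number of levels: it takes all levels except the very last one. Note that the mapped index has one entry per row, so the same prefix is visited once for every row that carries it: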

for idx in df.index.map(lambda x: x[:-1]):
    df_select = df.loc[idx]
rstreppa answered Sep 23 '22 22:09