Looping over a MultiIndex in pandas

Tags:

I have a MultiIndexed DataFrame df1, and would like to loop over it in such a way as to in each instance of the loop have a DataFrame with a regular non-hierarchical index which is the subset of df1 corresponding to the outer index entries. I.e., if i have:

FirstTable

I want to get

SecondTable

and subsequently C1, C2, etc. I also don't know what the names of these will actually be (C1, etc., just being placeholders here), so would just like to loop over the number of C_i values I have.

I have been stumbling around with iterrows and various loops and not getting any tangible results and don't really know how to proceed. I feel like a simple solution should exist but couldn't find anything that looked helpful in the documentation, probably due to my own lack of understanding.

351

asked Jul 17 '14 15:07

Charles Dillon

2 Answers

Using a modified example from here

Click to copy

In [30]: def mklbl(prefix,n):
        return ["%s%s" % (prefix,i)  for i in range(n)]
   ....: 

In [31]: columns = MultiIndex.from_tuples([('a','foo'),('a','bar'),
                                  ('b','foo'),('b','bah')],
                                   names=['lvl0', 'lvl1'])

In [33]: index = MultiIndex.from_product([mklbl('A',4),mklbl('B',2)])

In [34]: df = DataFrame(np.arange(len(index)*len(columns)).reshape((len(index),len(columns))),
               index=index,
               columns=columns).sortlevel().sortlevel(axis=1)

In [35]: df
Out[35]: 
lvl0     a         b     
lvl1   bar  foo  bah  foo
A0 B0    1    0    3    2
   B1    5    4    7    6
A1 B0    9    8   11   10
   B1   13   12   15   14
A2 B0   17   16   19   18
   B1   21   20   23   22
A3 B0   25   24   27   26
   B1   29   28   31   30

In [36]: df.loc['A0']
Out[36]: 
lvl0    a         b     
lvl1  bar  foo  bah  foo
B0      1    0    3    2
B1      5    4    7    6

In [37]: df.loc['A1']
Out[37]: 
lvl0    a         b     
lvl1  bar  foo  bah  foo
B0      9    8   11   10
B1     13   12   15   14

No looping is necessary.

You can also select these in order to return a frame (with the original MI) e.g. df.loc[['A1']]

If you want to get the values in the index:

Click to copy

In [38]: df.index.get_level_values(0).unique()
Out[38]: array(['A0', 'A1', 'A2', 'A3'], dtype=object)

174

answered Sep 17 '22 09:09

Jeff

Are you trying to do something like this?

Click to copy

for i in set(df.index):
    print df.loc[i].reset_index()

set(df.index) returns a set of unique tuples of your multi-index (hierarchical index).
df.loc[i].reset_index()... df.loc[i] of course returns a subset of your original dataframe, and the .reset_index() part will convert the index to columns

answered Sep 20 '22 09:09

John

Related questions
                            
                                Making Probability Distribution Functions (PDFs) from histograms
                            
                                Converting a very small python Decimal into a non-scientific notation string
                            
                                How can I download a PyPI package for pip installation at a later date?
                            
                                How does HAProxy achieves its speed?
                            
                                Functions access to global variables
                            
                                cv2.createTrackbar using python
                            
                                Make a Custom Class JSON serializable
                            
                                memcache.get returns wrong object (Celery, Django)
                            
                                Adding an additional index to an existing multi-index dataframe
                            
                                add a new column to an existing csv file
                            
                                error_perm: 550 Permission denied
                            
                                Pandas: decompress date range to individual dates
                            
                                Regular expression negative lookbehind of non-fixed length
                            
                                What are Constants and Literal constants?
                            
                                load csv file to numpy and access columns by name
                            
                                pandas apply function to multiple columns and multiple rows
                            
                                NTLM authentication with Scrapy for web scraping
                            
                                add user to mongodb via python
                            
                                Pandas: How could I iterate two dataframes which have exactly same format?
                            
                                Updated Bokeh to 0.5.0, now plots all previous versions of graph in one window

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Looping over a MultiIndex in pandas

Tags:

python

pandas

multi-index

Charles Dillon

People also ask

2 Answers

Jeff

John

Recent Activity

Donate For Us