How to iterate over pandas multiindex dataframe using index

Tags:

pandas

I have a DataFrame df which looks like this. date and Time are the two levels of its MultiIndex.

                           observation1   observation2
date          Time
2012-11-02    9:15:00      79.373668      224
              9:16:00      130.841316     477
2012-11-03    9:15:00      45.312814      835
              9:16:00      123.776946     623
              9:17:00      153.76646      624
              9:18:00      463.276946     626
              9:19:00      663.176934     622
              9:20:00      763.77333      621
2012-11-04    9:15:00      115.449437     122
              9:16:00      123.776946     555
              9:17:00      153.76646      344
              9:18:00      463.276946     212

I want to run a complex process over each day's block of data.

Pseudo code would look like

 for count in df(level 0 index) :
     new_df = get only chunk for count
     complex_process(new_df)

So, first of all, I could not find a way to access only the block for a single date, such as

2012-11-03    9:15:00      45.312814      835
              9:16:00      123.776946     623
              9:17:00      153.76646      624
              9:18:00      463.276946     626
              9:19:00      663.176934     622
              9:20:00      763.77333      621

and then send it for processing. I am doing this in a for loop because I am not sure whether it can be done without naming the exact value of the level 0 index. After some basic searching I found df.index.get_level_values(0), but it returns every value, repeats included, which makes the loop run multiple times for each day. I want to create one DataFrame per day and send it for processing.

603

asked Sep 19 '14 08:09

Yantraguru

1 Answer

One easy way would be to group by the first level of the index with groupby(level=0). Iterating over the resulting groupby object yields each group key together with a subframe containing that group's rows.

In [136]: for date, new_df in df.groupby(level=0):
     ...:     print(new_df)
     ...:
                    observation1  observation2
date       Time
2012-11-02 9:15:00     79.373668           224
           9:16:00    130.841316           477
                    observation1  observation2
date       Time
2012-11-03 9:15:00     45.312814           835
           9:16:00    123.776946           623
           9:17:00    153.766460           624
           9:18:00    463.276946           626
           9:19:00    663.176934           622
           9:20:00    763.773330           621
                    observation1  observation2
date       Time
2012-11-04 9:15:00    115.449437           122
           9:16:00    123.776946           555
           9:17:00    153.766460           344
           9:18:00    463.276946           212
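For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It assumes the index levels hold plain strings (the question does not say whether they are strings or datetimes) and rebuilds only a few rows of the question's frame. The body of complex_process is a hypothetical stand-in, since the real processing is not shown. The last two lines sketch the get_level_values route mentioned in the question: calling .unique() on the result removes the repeats, and df.xs then pulls out one day's block.

```python
import pandas as pd

# A few rows copied from the question; index values are assumed to be strings.
index = pd.MultiIndex.from_tuples(
    [
        ("2012-11-02", "9:15:00"),
        ("2012-11-02", "9:16:00"),
        ("2012-11-03", "9:15:00"),
        ("2012-11-03", "9:16:00"),
        ("2012-11-03", "9:17:00"),
        ("2012-11-04", "9:15:00"),
    ],
    names=["date", "Time"],
)
df = pd.DataFrame(
    {
        "observation1": [79.373668, 130.841316, 45.312814,
                         123.776946, 153.76646, 115.449437],
        "observation2": [224, 477, 835, 623, 624, 122],
    },
    index=index,
)


def complex_process(day_df):
    # Hypothetical placeholder: return the row count and the observation2 total.
    return len(day_df), int(day_df["observation2"].sum())


# One iteration per distinct date; day_df keeps both index levels.
results = {}
for date, day_df in df.groupby(level=0):
    results[date] = complex_process(day_df)

# Alternative: de-duplicate the level 0 values, then select each day's block.
unique_dates = df.index.get_level_values(0).unique()
per_day = {d: df.xs(d, level=0, drop_level=False) for d in unique_dates}
```

The groupby loop is usually the simpler choice, because it splits the frame in one pass instead of running a separate selection for every date.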

188

answered Oct 14 '22 18:10

chrisb
