Reindexing and filling on one level of a hierarchical index in pandas

I have a pandas DataFrame with a two-level hierarchical index ('date' and 'item_id'). Each row holds a variety of metrics for a particular item in a particular month. Here's a sample:

                    total_annotations  unique_tags
date       item_id
2007-04-01 2                       30           14
2007-05-01 2                       32           16
2007-06-01 2                       36           19
2008-07-01 2                       81           33
2008-11-01 2                       82           34
2009-04-01 2                       84           35
2010-03-01 2                       90           35
2010-04-01 2                      100           36
2010-11-01 2                      105           40
2011-05-01 2                      106           40
2011-07-01 2                      108           42
2005-08-01 3                      479          200
2005-09-01 3                      707          269
2005-10-01 3                      980          327
2005-11-01 3                     1176          373
2005-12-01 3                     1536          438
2006-01-01 3                     1854          497
2006-02-01 3                     2206          560
2006-03-01 3                     2558          632
2007-02-01 3                     5650         1019

As you can see, not every item has an observation for every month. What I want to do is reindex the DataFrame so that each item has a row for each month in a specified range. This is easy to accomplish for any single item. For item_id 99, for example:

baseDateRange = pd.date_range('2005-07-01','2013-01-01',freq='MS')
data.xs(99,level='item_id').reindex(baseDateRange,method='ffill')

But with this method, I'd have to iterate through all the item_ids and then merge everything back together, which seems woefully over-complicated.

So how can I apply this to the full DataFrame, ffill-ing the observations (and also the item_id index) so that each item_id has properly filled rows for all the dates in baseDateRange?

How does pandas use hierarchical indexes?

To make a column the index, use the set_index() method of a DataFrame. To use a single column as the index, pass its name as a string to set_index(). For multi-indexing (hierarchical indexing), pass a list of column names to set_index().
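
As a minimal sketch of the two calls described above: the frame `flat` and its values are hypothetical, with column names borrowed from the question's sample.

```python
import pandas as pd

# Hypothetical flat frame; column names follow the question's sample.
flat = pd.DataFrame({
    "date": pd.to_datetime(["2007-04-01", "2007-05-01", "2005-08-01"]),
    "item_id": [2, 2, 3],
    "total_annotations": [30, 32, 479],
})

# A single column name gives a one-level index.
single = flat.set_index("date")

# A list of column names gives a two-level MultiIndex.
multi = flat.set_index(["date", "item_id"])

print(multi.index.names)
```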

What is the use of reindexing in pandas?

The DataFrame reindex() method lets you change the row index and the column labels. Note: any label in the new index that is not present in the old one gets NaN values, unless a fill method is given.
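
A small sketch of that behavior, using a hypothetical two-row frame: reindexing onto a monthly range leaves NaN for months that were absent, while method='ffill' carries the last earlier observation forward.

```python
import pandas as pd

# Hypothetical frame with observations for April and June only.
obs = pd.DataFrame(
    {"unique_tags": [14, 19]},
    index=pd.to_datetime(["2007-04-01", "2007-06-01"]),
)

# Month-start dates from March through July.
months = pd.date_range("2007-03-01", "2007-07-01", freq="MS")

# Labels missing from the old index become NaN.
plain = obs.reindex(months)

# With ffill, each missing month takes the most recent earlier value;
# March stays NaN because nothing precedes it.
filled = obs.reindex(months, method="ffill")

print(plain)
print(filled)
```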

How do I change the multi-level index in pandas?

To convert a MultiIndex (multi-level index) to columns, use DataFrame.reset_index(). With the default drop=False, the index values are kept as columns and the DataFrame gets a new integer index starting from zero.
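
A brief sketch of reset_index() on a two-level index. The frame is hypothetical; the `level` argument shown last is a standard reset_index() parameter that moves only one level back to a column.

```python
import pandas as pd

# Hypothetical frame indexed by (date, item_id), as in the question.
multi = pd.DataFrame({
    "date": pd.to_datetime(["2007-04-01", "2007-05-01", "2005-08-01"]),
    "item_id": [2, 2, 3],
    "total_annotations": [30, 32, 479],
}).set_index(["date", "item_id"])

# Default drop=False: both levels come back as ordinary columns.
back = multi.reset_index()

# drop=True discards the index values instead of keeping them.
dropped = multi.reset_index(drop=True)

# Move only item_id out of the index, leaving date as the index.
only_item = multi.reset_index(level="item_id")

print(back.columns.tolist())
```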

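Essentially, for each group you want to reindex and ffill. The function passed to apply receives a DataFrame that still has both item_id and date in its index, so reset the index, set date as the index, and then reindex with filling. idx is your baseDateRange from above. Rows before an item's first observation remain NaN because there is no earlier value to fill from.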

In [33]: df.groupby(level='item_id').apply(
      lambda x: x.reset_index().set_index('date').reindex(idx,method='ffill')).head(30)
Out[33]: 
                    item_id  annotations  tags
item_id                                       
2       2005-07-01      NaN          NaN   NaN
        2005-08-01      NaN          NaN   NaN
        2005-09-01      NaN          NaN   NaN
        2005-10-01      NaN          NaN   NaN
        2005-11-01      NaN          NaN   NaN
        2005-12-01      NaN          NaN   NaN
        2006-01-01      NaN          NaN   NaN
        2006-02-01      NaN          NaN   NaN
        2006-03-01      NaN          NaN   NaN
        2006-04-01      NaN          NaN   NaN
        2006-05-01      NaN          NaN   NaN
        2006-06-01      NaN          NaN   NaN
        2006-07-01      NaN          NaN   NaN
        2006-08-01      NaN          NaN   NaN
        2006-09-01      NaN          NaN   NaN
        2006-10-01      NaN          NaN   NaN
        2006-11-01      NaN          NaN   NaN
        2006-12-01      NaN          NaN   NaN
        2007-01-01      NaN          NaN   NaN
        2007-02-01      NaN          NaN   NaN
        2007-03-01      NaN          NaN   NaN
        2007-04-01        2           30    14
        2007-05-01        2           32    16
        2007-06-01        2           36    19
        2007-07-01        2           36    19
        2007-08-01        2           36    19
        2007-09-01        2           36    19
        2007-10-01        2           36    19
        2007-11-01        2           36    19
        2007-12-01        2           36    19
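
For comparison, here is a self-contained sketch of a different route to the same result. It is not the groupby/apply method shown above: it builds every (item_id, month) pair with MultiIndex.from_product, reindexes once, and then forward-fills within each item. The sample values and the shortened date range are hypothetical.

```python
import pandas as pd

# Small hypothetical sample in the shape of the question's data.
data = pd.DataFrame({
    "date": pd.to_datetime(
        ["2007-04-01", "2007-06-01", "2005-08-01", "2005-09-01"]
    ),
    "item_id": [2, 2, 3, 3],
    "total_annotations": [30, 36, 479, 707],
    "unique_tags": [14, 19, 200, 269],
}).set_index(["date", "item_id"])

# Month-start range to fill (shortened here to keep the output small).
idx = pd.date_range("2005-07-01", "2007-08-01", freq="MS")

# Every (item_id, month) combination.
full = pd.MultiIndex.from_product(
    [data.index.get_level_values("item_id").unique(), idx],
    names=["item_id", "date"],
)

filled = (
    data.swaplevel()            # reorder levels to (item_id, date)
        .reindex(full)          # missing pairs become NaN rows
        .groupby(level="item_id")
        .ffill()                # fill forward within each item only
)

print(filled.head(30))
```

Grouping before the forward fill keeps one item's last value from leaking into the leading months of the next item. The item_id level itself is complete in the result, since it comes from the new index rather than from filling.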

Tags: python, pandas

Asked by moustachio. 1 answer, by Jeff.
