Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to shift a pandas MultiIndex Series?

In a regular time series you can shift it back or forward in time.

e.g. for the following time series:

start = datetime(2012,1,1)
end = datetime(2012,4,1)
rng = pd.date_range(start,end)
ts = pd.Series(np.random.randn(len(rng)), index = rng)

We can shift it with:

ts.shift(2, freq="D")

How can I do this for a MultiIndex time series on just one level?

e.g. for the following MultiIndex time series:

mi = [(dt,i) for dt in rng for i in range(5)]
ts_mi = pd.Series(np.random.randn(len(mi)), index = pd.MultiIndex.from_tuples(mi))

Which might look something like:

2012-01-01  0   -0.805353
            1    1.467167
            2   -1.207204
            3    1.658394
            4    1.497559
2012-01-02  0   -0.742510
            1    0.764594
            2    0.558660
            3   -0.479370
            4    0.653849
...

Shifting it using ts_mi.shift(2, freq="D") gives:

None   -0.805353
None    1.467167
None   -1.207204
None    1.658394
None    1.497559
None   -0.742510
None    0.764594
None    0.558660
None   -0.479370
None    0.653849
None   -0.138347
None    0.357479
None   -0.919202
None    1.300977
None   -0.360398
...
like image 701
metakermit Avatar asked Oct 23 '12 12:10

metakermit


1 Answers

ts_mi.unstack().shift(2, freq='D').stack()

One can see that the output is correct:

utils.side_by_side(ts_mi, ts_mi.unstack().shift(2, freq='D').stack())
2012-01-01  0    0.481555    2012-01-03  0    0.481555
            1    0.000628                1    0.000628
            2    2.509266                2    2.509266
            3    0.021571                3    0.021571
            4   -0.539981                4   -0.539981
2012-01-02  0   -1.465450    2012-01-04  0   -1.465450
            1    0.815251                1    0.815251
            2   -1.489051                2   -1.489051
            3    0.639746                3    0.639746
            4   -0.176939                4   -0.176939
2012-01-03  0   -0.441842    2012-01-05  0   -0.441842
            1   -0.792810                1   -0.792810
            2   -0.802665                2   -0.802665
            3    1.922190                3    1.922190
            4    0.165446                4    0.165446
...                          ...                      

How it works: df.unstack() moves the values in the nested 0, 1, 2, 3, 4 to the homonym columns, and df.stack() recovers the original nested index.

EDIT: here's the side_by_side function from @Wes_McKinney

def side_by_side(*objs, **kwds):
    from pandas.core.common import adjoin
    space = kwds.get('space', 4)
    reprs = [repr(obj).split('\n') for obj in objs]
    print adjoin(space, *reprs)
like image 61
meteore Avatar answered Oct 02 '22 05:10

meteore