Shift pandas dataframe down in a cyclical manner

If we have the following data:

import numpy as np
import pandas as pd

X = pd.DataFrame({"t":[1,2,3,4,5],"A":[34,12,78,84,26], "B":[54,87,35,25,82], "C":[56,78,0,14,13], "D":[0,23,72,56,14], "E":[78,12,31,0,34]})
X

    A   B   C   D   E  t
0  34  54  56   0  78  1
1  12  87  78  23  12  2
2  78  35   0  72  31  3
3  84  25  14  56   0  4
4  26  82  13  14  34  5 

How can I shift the data in a cyclical fashion so that the next step is:

    A   B   C   D   E  t
4  26  82  13  14  34  5 
0  34  54  56   0  78  1
1  12  87  78  23  12  2
2  78  35   0  72  31  3
3  84  25  14  56   0  4

And then:

    A   B   C   D   E  t
3  84  25  14  56   0  4
4  26  82  13  14  34  5 
0  34  54  56   0  78  1
1  12  87  78  23  12  2
2  78  35   0  72  31  3

etc.

This should also shift the index values with the row.

I know about pandas' X.shift(), but it does not wrap around: rows shifted off the end are dropped and replaced with NaN instead of reappearing at the top.

asked Dec 08 '16 at 16:12 by ishido

2 Answers

You can combine reindex with np.roll:

X = X.reindex(np.roll(X.index, 1))

Another option is to combine concat with iloc:

shift = 1
X = pd.concat([X.iloc[-shift:], X.iloc[:-shift]])

The resulting output:

    A   B   C   D   E  t
4  26  82  13  14  34  5
0  34  54  56   0  78  1
1  12  87  78  23  12  2
2  78  35   0  72  31  3
3  84  25  14  56   0  4
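
As a minimal sketch of the two options above, the snippet below wraps each in a small function and checks that they agree. The function names `roll_reindex` and `roll_concat` are hypothetical, the frame is a cut-down version of the one in the question, and the modulo step is an addition (not part of the original answer) so that a shift of zero or a shift larger than the frame still wraps around. The reindex variant assumes the index labels are unique.

```python
import numpy as np
import pandas as pd

# Cut-down version of the frame from the question.
X = pd.DataFrame({
    "t": [1, 2, 3, 4, 5],
    "A": [34, 12, 78, 84, 26],
    "B": [54, 87, 35, 25, 82],
})

def roll_reindex(df, shift):
    # Roll the index labels, then reorder the rows to match.
    # Assumes the index labels are unique.
    return df.reindex(np.roll(df.index, shift))

def roll_concat(df, shift):
    # Added assumption: reduce the shift modulo the length so that
    # shift=0 and shift>=len(df) also behave cyclically.
    shift %= len(df)
    if shift == 0:
        return df.copy()
    # Move the last `shift` rows to the top; labels travel with their rows.
    return pd.concat([df.iloc[-shift:], df.iloc[:-shift]])

once = roll_reindex(X, 1)
twice = roll_concat(X, 2)
print(once.index.tolist())
print(twice.index.tolist())
```

Calling either function repeatedly on its own result keeps cycling the rows, which is the "next step" behaviour the question asks for.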

Timings

Using the following setup to produce a larger DataFrame and functions for timing:

df = pd.concat([X]*10**5, ignore_index=True)

def root1(df, shift):
    return df.reindex(np.roll(df.index, shift))

def root2(df, shift):
    return pd.concat([df.iloc[-shift:], df.iloc[:-shift]])

def ed_chum(df, num):
    return pd.DataFrame(np.roll(df, num, axis=0), np.roll(df.index, num), columns=df.columns)

def divakar1(df, shift):
    return df.iloc[np.roll(np.arange(df.shape[0]), shift)]

def divakar2(df, shift):
    idx = np.mod(np.arange(df.shape[0])-1,df.shape[0])
    for _ in range(shift):
        df = df.iloc[idx]
    return df

I get the following timings:

%timeit root1(df.copy(), 25)
10 loops, best of 3: 61.3 ms per loop

%timeit root2(df.copy(), 25)
10 loops, best of 3: 26.4 ms per loop

%timeit ed_chum(df.copy(), 25)
10 loops, best of 3: 28.3 ms per loop

%timeit divakar1(df.copy(), 25)
10 loops, best of 3: 177 ms per loop

%timeit divakar2(df.copy(), 25)
1 loop, best of 3: 4.18 s per loop
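
The `%timeit` lines above only work inside IPython. For a plain script, a rough equivalent can be sketched with the standard-library `timeit` module. This is an illustration under assumptions, not a reproduction of the numbers above: the frame is much smaller than the `10**5` repeats used in the answer, only two of the functions are included, and the `results` dictionary is a hypothetical helper. Unlike the original statements, the timed call here does not include `df.copy()`.

```python
import timeit

import numpy as np
import pandas as pd

X = pd.DataFrame({"t": [1, 2, 3, 4, 5], "A": [34, 12, 78, 84, 26]})
# Assumption: far fewer repeats than the answer's 10**5, to keep the run short.
df = pd.concat([X] * 1000, ignore_index=True)

def root2(df, shift):
    return pd.concat([df.iloc[-shift:], df.iloc[:-shift]])

def ed_chum(df, num):
    return pd.DataFrame(np.roll(df, num, axis=0), np.roll(df.index, num), columns=df.columns)

results = {}  # hypothetical helper: function name -> best seconds per call
for func in (root2, ed_chum):
    # Take the best of 3 repeats of 10 calls each, as %timeit does by default in spirit.
    best = min(timeit.repeat(lambda: func(df, 25), number=10, repeat=3))
    results[func.__name__] = best / 10

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e3:.3f} ms per call")
```

Absolute figures depend on the machine and on the pandas and NumPy versions, so only the relative ordering is worth comparing.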
answered Nov 16 '22 at 02:11 by root

You can use np.roll in a custom func:

In [83]:
def roll(df, num):
    # Roll the values and the index labels together by num rows
    return pd.DataFrame(np.roll(df, num, axis=0), index=np.roll(df.index, num), columns=df.columns)

roll(X, 1)

Out[83]:
    A   B   C   D   E  t
4  26  82  13  14  34  5
0  34  54  56   0  78  1
1  12  87  78  23  12  2
2  78  35   0  72  31  3
3  84  25  14  56   0  4

In [84]:
roll(X,2)

Out[84]:
    A   B   C   D   E  t
3  84  25  14  56   0  4
4  26  82  13  14  34  5
0  34  54  56   0  78  1
1  12  87  78  23  12  2
2  78  35   0  72  31  3

This returns a new DataFrame built from the rolled array of values, with the index rolled by the same amount so that each label stays with its row.
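
One thing worth checking with this approach is column dtypes: passing a DataFrame to `np.roll` converts it to a single NumPy array, so columns of different types are cast to a common type. The sketch below uses a small made-up frame (the column names `n` and `x` are hypothetical) with one integer and one float column, and compares the array-based roll with a positional `iloc` roll like the `divakar1` function in the timings above.

```python
import numpy as np
import pandas as pd

# Made-up frame with mixed dtypes: one integer column, one float column.
df = pd.DataFrame({"n": [1, 2, 3], "x": [0.5, 1.5, 2.5]})

def roll_array(df, num):
    # Array-based roll: the whole frame becomes one float64 array first.
    return pd.DataFrame(np.roll(df, num, axis=0), index=np.roll(df.index, num), columns=df.columns)

def roll_iloc(df, num):
    # Positional roll: rows are selected by position, so each column keeps its dtype.
    return df.iloc[np.roll(np.arange(len(df)), num)]

via_array = roll_array(df, 1)
via_iloc = roll_iloc(df, 1)
print(via_array.dtypes)
print(via_iloc.dtypes)
```

With an all-integer frame such as the one in the question, the two give the same result; the difference only shows up when columns have different types.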

answered Nov 16 '22 at 04:11 by EdChum