I have the following data frame:
| col0 |
|---|
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| ... |
| 1000 |
I'd like roll col0 into a data frame with a window size of 5, so the outcome would be like this:
| col0 | col1 | col2 | ... | col995 |
|---|---|---|---|---|
| 1 | 2 | 3 | ... | 996 |
| 2 | 3 | 4 | ... | 997 |
| 3 | 4 | 5 | ... | 998 |
| 4 | 5 | 6 | ... | 999 |
| 5 | 6 | 7 | ... | 1000 |
I've tried using loops and "iloc" which would produce correct results, but as the original data frame gets much longer, it would take too long to finish. To complete 10,000, it'd take almost 2 minutes, 20,000 almost 10 minutes, and so on... Is there any way to do it faster, more efficiently in Python?
Use numpy.lib.stride_tricks.sliding_window_view and transpose (T):
from numpy.lib.stride_tricks import sliding_window_view as swv
out = pd.DataFrame(swv(df['col0'], 5).T).add_prefix('col')
Output:
col0 col1 col2 col3 col4 col5 col6 col7 col8 col9 ... col986 \
0 1 2 3 4 5 6 7 8 9 10 ... 987
1 2 3 4 5 6 7 8 9 10 11 ... 988
2 3 4 5 6 7 8 9 10 11 12 ... 989
3 4 5 6 7 8 9 10 11 12 13 ... 990
4 5 6 7 8 9 10 11 12 13 14 ... 991
col987 col988 col989 col990 col991 col992 col993 col994 col995
0 988 989 990 991 992 993 994 995 996
1 989 990 991 992 993 994 995 996 997
2 990 991 992 993 994 995 996 997 998
3 991 992 993 994 995 996 997 998 999
4 992 993 994 995 996 997 998 999 1000
[5 rows x 996 columns]
Reproducible input:
N = 1000
df = pd.DataFrame({'col0': range(1, N+1)})
Timing for 100k rows:
25 ms ± 402 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
You can use sliding_window_view for this:
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view
df = pd.DataFrame({'Col0': range(1,1001)})
data = sliding_window_view(df['Col0'], 5).T
df_new = pd.DataFrame(data,
columns=[f'Col{i}' for i in range(data.shape[1])])
print(df_new)
Col0 Col1 Col2 Col3 Col4 ... Col991 Col992 Col993 Col994 Col995
0 1 2 3 4 5 ... 992 993 994 995 996
1 2 3 4 5 6 ... 993 994 995 996 997
2 3 4 5 6 7 ... 994 995 996 997 998
3 4 5 6 7 8 ... 995 996 997 998 999
4 5 6 7 8 9 ... 996 997 998 999 1000
[5 rows x 996 columns]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With