I have a time series in pandas that looks like this (order by id):
id time value
1 0 2
1 1 4
1 2 5
1 3 10
1 4 15
1 5 16
1 6 18
1 7 20
2 15 3
2 16 5
2 17 8
2 18 10
4 6 5
4 7 6
I want downsampling time from 1 minute to 3 minutes for each group id. And value is a maximum of group (id and 3 minutes).
The output should be like:
id time value
1 0 5
1 1 16
1 2 20
2 0 8
2 1 10
4 0 6
I tried loop it take long time process.
Any idea how to solve this for large dataframe?
Thanks!
You can convert your time series to an actual timedelta, then use resample for a vectorized solution:
t = pd.to_timedelta(df.time, unit='T')
s = df.set_index(t).groupby('id').resample('3T').last().reset_index(drop=True)
s.assign(time=s.groupby('id').cumcount())
id time value
0 1 0 5
1 1 1 16
2 1 2 20
3 2 0 8
4 2 1 10
5 4 0 6
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With