Time differentiation in Pandas

Say I have a dataframe with several timestamps and values. I would like to measure Δ values / Δt every 2.5 seconds. Does Pandas provide any utilities for time differentiation?

                              time_stamp   values
19492   2014-10-06 17:59:40.016000-04:00  1832128                                
167106  2014-10-06 17:59:41.771000-04:00  2671048                                
202511  2014-10-06 17:59:43.001000-04:00  2019434                                
161457  2014-10-06 17:59:44.792000-04:00  1294051                                
203944  2014-10-06 17:59:48.741000-04:00   867856
Amelio Vazquez-Reina asked Oct 07 '14 21:10



2 Answers

It most certainly does. First, you'll need to convert your timestamps into a pandas DatetimeIndex and then use the offset-based resampling functions available to series/dataframes indexed that way. The pandas time series documentation covers this, including the list of offset aliases.

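This code should resample your data to 2.5 s intervals:

import pandas as pd

# df is your dataframe; index the 'values' column by its timestamps
index = pd.DatetimeIndex(df['time_stamp'])
values = pd.Series(df['values'].values, index=index)

# Offset aliases: 'ms' = milliseconds, so '2500ms' is a 2.5-second bin.
# resample() only groups the data; .mean() averages the samples in each bin
resampled_values = values.resample('2500ms').mean()

resampled_values.diff()  # difference between consecutive 2.5 s bins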

That should do it.
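As a self-contained sketch of the steps above, here is the same idea run on the five sample rows from the question. Two things are assumptions rather than part of the answer: each bin is aggregated with `.mean()` (other choices such as `.last()` are equally plausible), and the per-bin difference is divided by the 2.5 s bin width to get a rate per second.

```python
import pandas as pd

# Sample rows copied from the question (the original integer index is dropped).
df = pd.DataFrame({
    "time_stamp": pd.to_datetime([
        "2014-10-06 17:59:40.016000-04:00",
        "2014-10-06 17:59:41.771000-04:00",
        "2014-10-06 17:59:43.001000-04:00",
        "2014-10-06 17:59:44.792000-04:00",
        "2014-10-06 17:59:48.741000-04:00",
    ]),
    "values": [1832128, 2671048, 2019434, 1294051, 867856],
})

# Index the 'values' column by its timestamps.
series = df.set_index("time_stamp")["values"]

# Assumption: average the samples that fall into each 2.5 s bin.
# Bins that contain no samples come out as NaN.
binned = series.resample("2500ms").mean()

# Change between consecutive bins, then change per second.
delta = binned.diff()
rate = delta / 2.5

print(pd.DataFrame({"binned": binned, "delta": delta, "rate": rate}))
```

Note that an empty bin produces NaN, and so does any difference that touches it; whether to interpolate or forward-fill such gaps first depends on the data.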

tyleha answered Sep 22 '22 08:09


If you really want the time derivative, then you also need to divide by the time difference (delta time, dt) since the last sample.

An example:

dti = pd.DatetimeIndex([
    '2018-01-01 00:00:00',
    '2018-01-01 00:00:02',
    '2018-01-01 00:00:03'])

X = pd.DataFrame({'data': [1,3,4]}, index=dti)

X.head()
                    data
2018-01-01 00:00:00 1
2018-01-01 00:00:02 3
2018-01-01 00:00:03 4

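You can find the time deltas by calling diff() on the DatetimeIndex (wrapped in a Series). This gives you a series of Timedelta values. You only need their duration in seconds, though:

# total_seconds() keeps fractions; .dt.seconds would only return the whole-seconds component
dt = pd.Series(X.index).diff().dt.total_seconds().values

dXdt = X.diff().div(dt, axis=0)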

dXdt.head()
                    data
2018-01-01 00:00:00 NaN
2018-01-01 00:00:02 1.0
2018-01-01 00:00:03 1.0

As you can see, this approach takes into account that there are two seconds between the first two values, and only one between the two last values. :)
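One detail worth checking when the samples are less than a whole second apart, as in the question: `.dt.seconds` returns only the integer seconds component of each Timedelta, while `.dt.total_seconds()` returns the full duration as a float. The sketch below uses the first three timestamps from the question (time zone dropped for brevity, which is an assumption of this example) to show the difference.

```python
import pandas as pd

# First three samples from the question; time zone omitted for brevity.
times = pd.to_datetime([
    "2014-10-06 17:59:40.016",
    "2014-10-06 17:59:41.771",
    "2014-10-06 17:59:43.001",
])
X = pd.DataFrame({"values": [1832128, 2671048, 2019434]}, index=times)

# Time since the previous sample, as Timedelta values (first entry is NaT).
gaps = X.index.to_series().diff()

whole = gaps.dt.seconds          # seconds component only: 1.755 s becomes 1
exact = gaps.dt.total_seconds()  # full duration as a float: 1.755

# Time derivative between consecutive samples, in value units per second.
dXdt = X["values"].diff() / exact

print(pd.DataFrame({"whole": whole, "exact": exact, "dXdt": dXdt}))
```

Dividing by `whole` instead of `exact` here would overstate the first rate by roughly 75%, so `total_seconds()` is the safer default for irregular, sub-second sampling.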

ViggoTW answered Sep 21 '22 08:09