
Python pandas: Finding derivatives from Dataframe

Tags: python, pandas

My DataFrame looks like this:

                     date     time   sym        ask0        ask1        ask2  
0     2017-08-21 07:00:00 07:00:00   7YR  100.812500  100.828125  100.843750   
1     2017-08-21 07:00:15 07:00:15   7YR  100.812500  100.828125  100.843750   
2     2017-08-21 07:00:30 07:00:30   7YR  100.812500  100.828125  100.843750   
3     2017-08-21 07:00:45 07:00:45   7YR  100.812500  100.828125  100.843750   
4     2017-08-21 07:01:00 07:01:00   7YR  100.812500  100.828125  100.843750   
5     2017-08-21 07:01:15 07:01:15   7YR  100.812500  100.828125  100.843750   
6     2017-08-21 07:01:30 07:01:30   7YR  100.812500  100.828125  100.843750   
7     2017-08-21 07:01:45 07:01:45   7YR  100.812500  100.828125  100.843750   
8     2017-08-21 07:02:00 07:02:00   7YR  100.812500  100.828125  100.843750   
9     2017-08-21 07:02:15 07:02:15   7YR  100.812500  100.828125  100.843750   
10    2017-08-21 07:02:30 07:02:30   7YR  100.812500  100.828125  100.843750   
11    2017-08-21 07:02:45 07:02:45   7YR  100.828125  100.843750  100.859375   
12    2017-08-21 07:03:00 07:03:00   7YR  100.812500  100.828125  100.843750   
13    2017-08-21 07:03:15 07:03:15   7YR  100.828125  100.843750  100.859375   
14    2017-08-21 07:03:30 07:03:30   7YR  100.828125  100.843750  100.859375   
15    2017-08-21 07:03:45 07:03:45   7YR  100.828125  100.843750  100.859375   
16    2017-08-21 07:04:00 07:04:00   7YR  100.828125  100.843750  100.859375   
17    2017-08-21 07:04:15 07:04:15   7YR  100.828125  100.843750  100.859375   
18    2017-08-21 07:04:30 07:04:30   7YR  100.828125  100.843750  100.859375   
19    2017-08-21 07:04:45 07:04:45   7YR  100.828125  100.843750  100.859375   
20    2017-08-21 07:05:00 07:05:00   7YR  100.828125  100.843750  100.859375   
21    2017-08-21 07:05:15 07:05:15   7YR  100.828125  100.843750  100.859375   
22    2017-08-21 07:05:30 07:05:30   7YR  100.828125  100.843750  100.859375 

I would like to calculate the derivative: d = d(ask0)/dt

where dt = 2 minutes.

So for example,

t   = 2017-08-21 07:04:30 07:04:30   ask@t   = 100.828125
t-1 = 2017-08-21 07:02:30 07:02:30   ask@t-1 = 100.812500
dask/dt = (100.828125 - 100.812500)/2 = 0.0078125
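The arithmetic in this example can be checked with a short sketch. The two quotes are copied from rows 10 and 18 of the table; the variable names are only illustrative.

```python
import pandas as pd

# Values copied from rows 10 and 18 of the table in the question.
t_prev = pd.Timestamp("2017-08-21 07:02:30")
t_now = pd.Timestamp("2017-08-21 07:04:30")
ask_prev = 100.812500
ask_now = 100.828125

# Elapsed time between the two quotes, expressed in minutes.
dt_minutes = (t_now - t_prev) / pd.Timedelta(minutes=1)

# Finite-difference estimate of d(ask0)/dt, in price units per minute.
derivative = (ask_now - ask_prev) / dt_minutes
print(dt_minutes, derivative)  # 2.0 0.0078125
```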

I know I can do this by iterating over each row in the DataFrame, but I was wondering whether there is a pandas function that does this.

Chris Johnson asked Oct 23 '25 15:10



2 Answers

The first step is to ensure that date is a datetime column:

df.date = pd.to_datetime(df.date)

Solution

df.set_index('date').ask0.rolling('121s').apply(lambda x: x[-1] - x[0], raw=True) / 2

date
2017-08-21 07:00:00    0.000000
2017-08-21 07:00:15    0.000000
2017-08-21 07:00:30    0.000000
2017-08-21 07:00:45    0.000000
2017-08-21 07:01:00    0.000000
2017-08-21 07:01:15    0.000000
2017-08-21 07:01:30    0.000000
2017-08-21 07:01:45    0.000000
2017-08-21 07:02:00    0.000000
2017-08-21 07:02:15    0.000000
2017-08-21 07:02:30    0.000000
2017-08-21 07:02:45    0.007812
2017-08-21 07:03:00    0.000000
2017-08-21 07:03:15    0.007812
2017-08-21 07:03:30    0.007812
2017-08-21 07:03:45    0.007812
2017-08-21 07:04:00    0.007812
2017-08-21 07:04:15    0.007812
2017-08-21 07:04:30    0.007812
2017-08-21 07:04:45    0.000000
2017-08-21 07:05:00    0.007812
2017-08-21 07:05:15    0.000000
2017-08-21 07:05:30    0.000000
Name: ask0, dtype: float64
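For anyone who wants to reproduce this without the original file, here is a minimal self-contained sketch. It assumes the ask0 column can be rebuilt from the 23 rows shown in the question (the names idx, low and high are only illustrative). The window is 121 seconds rather than 120 because a time-based window is open on the left by default, so '121s' still includes the quote from exactly 2 minutes earlier. Passing raw=True hands the lambda a NumPy array, which makes x[-1] and x[0] plain positional lookups.

```python
import pandas as pd

# Rebuild the ask0 column from the question: 23 quotes, 15 seconds apart.
idx = pd.date_range("2017-08-21 07:00:00", periods=23, freq="15s", name="date")
low, high = 100.812500, 100.828125
ask0 = pd.Series([low] * 11 + [high, low] + [high] * 10, index=idx, name="ask0")

# Each window covers (t - 121s, t], so it contains the quote at t - 120s.
# raw=True passes a NumPy array, so x[-1] and x[0] are the last and first values.
deriv = ask0.rolling("121s").apply(lambda x: x[-1] - x[0], raw=True) / 2

print(deriv.loc[pd.Timestamp("2017-08-21 07:04:30")])  # 0.0078125
```

Note that for the first eight rows the window holds less than 2 minutes of history, so dividing by 2 there is only a convention, not a true 2-minute difference.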
piRSquared answered Oct 25 '25 05:10



First, make sure that date is a datetime Series: df.date = pd.to_datetime(df.date). Make that Series the index of your DataFrame with df = df.set_index('date'). Then resample the price columns to a 15-second step and interpolate (time and sym are not numeric, so they are left out):

resampled = df[['ask0', 'ask1', 'ask2']].resample('15s').interpolate()

Now all rows are evenly spaced 15 seconds apart, so the row from 2 minutes earlier is the row 8 positions back (120 s / 15 s = 8).

resampled['deriv'] = (resampled['ask0'] - resampled['ask0'].shift(8))/2
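The steps above can be tried end to end with a small sketch. It assumes the ask0 values from the question and, purely for illustration, drops one quote (07:01:00) so that resampling has a gap to fill; that missing row is hypothetical and not part of the original data.

```python
import pandas as pd

# Rebuild the ask0 column from the question: 23 quotes, 15 seconds apart.
idx = pd.date_range("2017-08-21 07:00:00", periods=23, freq="15s", name="date")
low, high = 100.812500, 100.828125
df = pd.DataFrame({"ask0": [low] * 11 + [high, low] + [high] * 10}, index=idx)

# Hypothetical gap: remove the 07:01:00 quote to simulate a missing tick.
df = df.drop(pd.Timestamp("2017-08-21 07:01:00"))

# Put the data back on a regular 15-second grid; the gap is filled linearly.
resampled = df.resample("15s").interpolate()

# 120 s / 15 s = 8 rows, so shift(8) pairs each quote with the one 2 minutes earlier.
resampled["deriv"] = (resampled["ask0"] - resampled["ask0"].shift(8)) / 2

print(resampled["deriv"].tail())
```

Unlike the rolling-window approach, the first eight rows of deriv are NaN here, because there is no row 2 minutes earlier to subtract.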
DYZ answered Oct 25 '25 04:10
