I have data that I'm importing from an hdf5 file. So, it comes in looking like this:
import pandas as pd
tmp=pd.Series([1.,3.,4.,3.,5.],['2016-06-27 23:52:00','2016-06-27 23:53:00','2016-06-27 23:54:00','2016-06-27 23:55:00','2016-06-27 23:59:00'])
tmp.index=pd.to_datetime(tmp.index)
>>>tmp
2016-06-27 23:52:00 1.0
2016-06-27 23:53:00 3.0
2016-06-27 23:54:00 4.0
2016-06-27 23:55:00 3.0
2016-06-27 23:59:00 5.0
dtype: float64
I would like to find the local slope of the data. If I just do tmp.diff(), I get the local change in value, but I want the change in value per second (the time derivative). I would like to do something like this, but it is the wrong way to do it and gives an error:
tmp.diff()/tmp.index.diff()
I have figured out that I can do it by converting all the data to a DataFrame, but that seems inefficient, especially since I'm going to have to work with a large on-disk file in chunks. Is there a better way to do it than this:
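import numpy as np
df=pd.DataFrame({'Value': tmp})
# the index as integer nanoseconds since the epoch, converted to seconds
df['secvalue']=df.index.astype(np.int64)/1e+9
df['slope']=df['Value'].diff()/df['secvalue'].diff()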
Use numpy.gradient
import numpy as np
import pandas as pd
slope = pd.Series(np.gradient(tmp.values), tmp.index, name='slope')
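As a side note, np.gradient with no spacing argument assumes unit steps between samples, so the result above is a change per sample rather than per second. np.gradient also accepts an array of sample coordinates, so one possible sketch is to pass the elapsed time in seconds. The names `seconds` and `slope_per_sec` are illustrative only, and the data is the sample from the question.

```python
import numpy as np
import pandas as pd

tmp = pd.Series(
    [1., 3., 4., 3., 5.],
    index=pd.to_datetime([
        '2016-06-27 23:52:00', '2016-06-27 23:53:00', '2016-06-27 23:54:00',
        '2016-06-27 23:55:00', '2016-06-27 23:59:00',
    ]),
)

# Elapsed time in seconds since the first sample (hypothetical helper name)
seconds = np.asarray((tmp.index - tmp.index[0]).total_seconds())

# Passing the coordinates lets np.gradient account for the uneven spacing:
# central differences in the interior, one-sided differences at the ends
slope_per_sec = pd.Series(np.gradient(tmp.values, seconds), tmp.index, name='slope')
print(slope_per_sec)
```

This keeps everything in a Series and avoids both the DataFrame and the resampling step, at the cost of relying on np.gradient's handling of non-uniform spacing.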
To address the unequal temporal index, I'd resample to one-minute intervals and interpolate. Then my gradients would be over equal intervals (one minute each, so the slope is in units per minute; divide by 60 to get per second).
tmp_ = tmp.resample('min').interpolate()
slope = pd.Series(np.gradient(tmp_.values), tmp_.index, name='slope')
df = pd.concat([tmp_.rename('data'), slope], axis=1)
df
df.plot()
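If a plain backward difference per second is enough, which is what the question's tmp.diff()/tmp.index.diff() was reaching for, a minimal sketch is to turn the index into a Series first so that the time differences can be converted to seconds. The names `dt` and `rate` are illustrative, and the data is again the sample from the question.

```python
import numpy as np
import pandas as pd

tmp = pd.Series(
    [1., 3., 4., 3., 5.],
    index=pd.to_datetime([
        '2016-06-27 23:52:00', '2016-06-27 23:53:00', '2016-06-27 23:54:00',
        '2016-06-27 23:55:00', '2016-06-27 23:59:00',
    ]),
)

# Time between consecutive samples, in seconds (first entry is NaN)
dt = tmp.index.to_series().diff().dt.total_seconds()

# Change in value divided by elapsed seconds; both Series share the same index
rate = tmp.diff() / dt
print(rate)
```

Because this only looks at neighbouring rows, it should be straightforward to apply per chunk when reading a large file, as long as the last row of each chunk is carried over to the next one.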