
Calculating slope for a series trendline in Pandas

Is there an idiomatic way to get the slope of a linear trend line fitted to the values in a DataFrame column? The data is indexed by a DatetimeIndex.

Dmitry B. asked Jul 14 '16 22:07



1 Answer

This should do it:

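import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100, 5), pd.date_range('2012-01-01', periods=100))

def trend(df):
    df = df.copy().sort_index()
    # Julian dates are around 2.4e6, so the raw normal equations are badly
    # conditioned; center the dates before solving the least squares problem.
    dates = df.index.to_julian_date().values
    offset = dates.mean()
    x = np.column_stack([np.ones_like(dates), dates - offset])
    y = df.values
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    # Shift the intercept back so 'Constant' is still the value at Julian date 0.
    coef[0] -= coef[1] * offset
    return pd.DataFrame(coef.T, df.columns, ['Constant', 'Trend'])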


trend(df)
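
As a cross-check on the slopes, here is a minimal sketch of the same per-column fit using np.polyfit, with time measured in days since the first timestamp. The helper name slope_per_day and the small check frame are hypothetical and not part of the answer above.

```python
import numpy as np
import pandas as pd

def slope_per_day(frame):
    """Least squares slope of each column, in column units per day (hypothetical helper)."""
    frame = frame.sort_index()
    # Days elapsed since the first timestamp; keeps x small and well scaled.
    days = (frame.index - frame.index[0]) / pd.Timedelta(days=1)
    # polyfit accepts a 2-D y and fits every column at once;
    # row 0 of the result holds the degree-1 (slope) coefficients.
    coefs = np.polyfit(np.asarray(days, dtype=float),
                       frame.to_numpy(dtype=float), deg=1)
    return pd.Series(coefs[0], index=frame.columns, name='Trend')

# Noise-free columns with known slopes of 3.0 and -0.5 per day.
idx = pd.date_range('2012-01-01', periods=100)
check = pd.DataFrame({'a': 3.0 * np.arange(100) + 1.0,
                      'b': -0.5 * np.arange(100)}, index=idx)
slopes = slope_per_day(check)
```

Because Julian dates also count days, these slopes should agree with the Trend column from trend() on the same data; only the intercept differs, since here it refers to the first timestamp rather than Julian date 0.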


To check the fit visually, build a noisy linear series on the same index as df above and plot it against the fitted trend line:

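# Synthetic series: slope of 10 per day plus uniform noise in [0, 1000)
df_sample = pd.DataFrame(df.index.to_julian_date() * 10 + 2 + np.random.rand(100) * 1e3, df.index)

coef = trend(df_sample)
# Evaluate the fitted line at each date and plot it over the raw points
df_sample['trend'] = coef.iloc[0, 1] * df_sample.index.to_julian_date() + coef.iloc[0, 0]
df_sample.plot(style=['.', '-'])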
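
If only the slope of a single column is needed, the least squares slope reduces to cov(t, y) / var(t). A minimal sketch, assuming time is measured in days via the Julian date; the name series_slope is hypothetical:

```python
import numpy as np
import pandas as pd

def series_slope(s):
    """Least squares slope of a datetime-indexed Series, per day (hypothetical helper)."""
    s = s.dropna().sort_index()
    # Time in days, shifted to start at 0 to avoid large Julian date values.
    t = pd.Series(s.index.to_julian_date(), index=s.index)
    t = t - t.iloc[0]
    # Series.cov and Series.var both default to ddof=1, so the factors cancel.
    return s.cov(t) / t.var()

idx = pd.date_range('2012-01-01', periods=100)
julian = np.asarray(idx.to_julian_date(), dtype=float)

# Noise-free line with slope 10 per day: the helper should recover about 10.
exact = series_slope(pd.Series(julian * 10 + 2, index=idx))

# Same line with uniform noise, as in df_sample; the estimate varies with the draw.
rng = np.random.default_rng(0)
noisy = series_slope(pd.Series(julian * 10 + 2 + rng.random(100) * 1e3, index=idx))
```

With noise this large relative to a 100-day window, the estimated slope can differ noticeably from 10, so a longer series gives a tighter estimate.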


piRSquared answered Oct 29 '22 18:10
