I have daily stock price data from Yahoo Finance in a DataFrame called price_data. I would like to add a column to it that provides the fitted value from a time series trend of the Adj Close column.
Here is the structure of the data I am using:
In [41]: type(price_data)
Out[41]: pandas.core.frame.DataFrame
In [42]: list(price_data.columns.values)
Out[42]: ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']
In [45]: type(price_data.index)
Out[45]: pandas.tseries.index.DatetimeIndex
What is the neatest way of achieving this in the Python language?
As an aside, the following achieved this in R:
all_time_fitted <- function(data)
{
all_time_model <- lm(Adj.Close ~ Date, data=data)
fitted_value <- predict(all_time_model)
return(fitted_value)
}
Here is some sample data:
In [3]: price_data
Out[3]:
             Open   High    Low  Close   Volume  Adj Close
Date
2005-09-27  21.05  21.40  19.10  19.30   961200   19.16418
2005-09-28  19.30  20.53  19.20  20.50  5747900   20.35573
2005-09-29  20.40  20.58  20.10  20.21  1078200   20.06777
2005-09-30  20.26  21.05  20.18  21.01  3123300   20.86214
2005-10-03  20.90  21.75  20.90  21.50  1057900   21.34869
2005-10-04  21.44  22.50  21.44  22.16  1768800   22.00405
2005-10-05  22.10  22.31  21.75  22.20   904300   22.04377
Each observation in a time series can be forecast using all previous observations. We call these fitted values and they are denoted by $\hat{y}_{t|t-1}$, meaning the forecast of $y_t$ based on observations $y_1, \dots, y_{t-1}$.
In Python, these trend lines can be plotted with the matplotlib.pyplot library, which draws figures from the given data.
Quick and dirty ...
# get some data
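# pandas.io.data was removed from pandas; the separate pandas-datareader package replaces it
import pandas_datareader.data as web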
import datetime
start = datetime.datetime(2015, 1, 1)
end = datetime.datetime(2015, 4, 30)
df=web.DataReader("F", 'yahoo', start, end)
# a bit of munging - better column name - Day as integer
df = df.rename(columns={'Adj Close':'AdjClose'})
dayZero = df.index[0]
df['Day'] = (df.index - dayZero).days
# fit a linear regression
import statsmodels.formula.api as sm
fit = sm.ols(formula="AdjClose ~ Day", data=df).fit()
print(fit.summary())
predict = fit.predict(df)
df['fitted'] = predict
# plot
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(8,4))
ax.scatter(df.index, df.AdjClose)
ax.plot(df.index, df.fitted, 'r')
ax.set_ylabel('$')
fig.suptitle('Yahoo')
plt.show()
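If you would rather not depend on a download or on statsmodels, the same idea can be sketched directly against the price_data frame from the question. This is only a minimal sketch: it assumes a straight-line trend fitted with numpy.polyfit on the number of calendar days since the first row, the helper name add_linear_trend and the output column name fitted are made up for illustration, and the frame is rebuilt here from the sample rows in the question (Adj Close only).

```python
import numpy as np
import pandas as pd

# Sample rows from the question (Adj Close only), indexed by trading date.
price_data = pd.DataFrame(
    {"Adj Close": [19.16418, 20.35573, 20.06777, 20.86214,
                   21.34869, 22.00405, 22.04377]},
    index=pd.DatetimeIndex(
        ["2005-09-27", "2005-09-28", "2005-09-29", "2005-09-30",
         "2005-10-03", "2005-10-04", "2005-10-05"],
        name="Date",
    ),
)


def add_linear_trend(data, column="Adj Close", out="fitted"):
    """Hypothetical helper: return a copy of `data` with a linear-trend column.

    The regressor is the number of calendar days since the first row,
    mirroring the 'Day' column in the answer above.
    """
    days = np.asarray((data.index - data.index[0]).days, dtype=float)
    # Least-squares straight line: column ~ intercept + slope * days
    slope, intercept = np.polyfit(days, data[column].to_numpy(), deg=1)
    result = data.copy()
    result[out] = intercept + slope * days
    return result


fitted = add_linear_trend(price_data)
print(fitted)
```

Because the helper returns a copy, price_data itself is left unchanged; assign the result back (price_data = add_linear_trend(price_data)) if you want the column added to the original frame. Calendar days are used so that weekends count as gaps; swapping in np.arange(len(data)) would treat every trading day as equally spaced instead.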