
Future prediction using time series data set with Tensorflow

I have time series data covering almost 5 years. Using this data, I want to forecast the next 2 years. How can I do this?

I have looked at many websites on this topic. Most of them only predict values for the same data set used for training; they do not forecast the future, for example the next 30 days. Is it possible to achieve this with TensorFlow, and if so, how?

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dropout

dataset_train = pd.read_csv(r'C:\Users\Kavin\source\repos\SampleTensorFlow\SampleTensorFlow\data\traindataset.csv')
training_set = dataset_train.iloc[:, 1:2].values

sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(training_set)

X_train = []
y_train = []
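# Build 60-step input windows over the whole training set (length not hard-coded)
for i in range(60, len(training_set_scaled)):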
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)

X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))


regressor = Sequential()

regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))


regressor.add(Dense(units = 1))

regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)


dataset_test = pd.read_csv(r'C:\Users\Kavin\source\repos\SampleTensorFlow\SampleTensorFlow\data\testdataset.csv')
# Copy so that adding a column later does not write to a slice of dataset_test
result = dataset_test[['Date','Open']].copy()
real_stock_price = dataset_test.iloc[:, 1:2].values


dataset_total = pd.concat((dataset_train['Open'], dataset_test['Open']), axis = 0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
# One window per test row; inputs holds the 60 preceding rows plus the test rows
for i in range(60, len(inputs)):
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
predicted_stock_price = regressor.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)

result['PredictedResult'] = pd.Series(predicted_stock_price.ravel(), index=result.index)

result.to_csv(r"C:\Users\Kavin\Downloads\PredictedStocks.csv", index=False)

ax = plt.gca()

result.plot(kind='line', x='Date', y='Open', color='red', label = 'Real Stock Price', ax=ax)
result.plot(kind='line', x='Date', y='PredictedResult', color='blue', label = 'Predicted Stock Price', ax=ax)

plt.show()
Clinton Prakash asked Nov 07 '22 19:11


1 Answer

For every machine learning problem, start by asking yourself: "What do I want to predict, and what data do I have?"

In your case, you want to predict values some distance into the future; let's call that offset T.

We suppose that your current data is labelled, i.e. for each sample/row (x) you have a corresponding value (y). Let xt be the timestamp of sample x.

If you want to predict y at time xt + T, then you must feed your algorithm training data in which the label for each sample x is the value of y at time xt + T.

This way your algorithm will "learn" to predict the value of y at time xt + T from the data at time xt.
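Applied to the 60-step windows in the question, the idea could look roughly like the sketch below. The helper name make_windows and the toy series are hypothetical, chosen only for illustration; with horizon=1 the labels match the loop in the question, and a larger horizon pairs each window with a value further ahead.

```python
import numpy as np

def make_windows(series, window, horizon):
    """Pair each `window`-long slice with the value `horizon` steps after its last point.

    Hypothetical helper for illustration; not part of any library.
    """
    X, y = [], []
    # A window covers indices i - window .. i - 1; its label sits at i - 1 + horizon,
    # which must still be inside the series.
    for i in range(window, len(series) - horizon + 1):
        X.append(series[i - window:i])
        y.append(series[i - 1 + horizon])
    return np.array(X), np.array(y)

# Toy series 0..9 standing in for the scaled 'Open' values.
series = np.arange(10, dtype=float)
X, y = make_windows(series, window=3, horizon=2)
```

Here the first window [0, 1, 2] is labelled with 4, the value two steps after its last point. Windows near the end of the series are dropped because their future value is not known yet.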

With pandas, these shifted labels can be built with the shift method, which moves a column's values by a given number of rows.
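A minimal sketch of that labelling follows, assuming evenly spaced rows so that T can be counted in rows. The toy values, the column name Target, and T = 2 are assumptions made for the example; in the question the series would be the 'Open' column.

```python
import pandas as pd

# Hypothetical toy data; in the question this would come from traindataset.csv.
df = pd.DataFrame({
    "Date": pd.date_range("2022-01-01", periods=6, freq="D"),
    "Open": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
})

T = 2  # forecast horizon in rows (assumed value for illustration)

# shift(-T) moves every value T rows up, so row t now holds the value from row t + T.
df["Target"] = df["Open"].shift(-T)

# The last T rows have no future value yet (NaN), so they cannot be used as training labels.
train = df.dropna(subset=["Target"])
```

Each remaining row pairs the value at time xt with the value T rows later, which is the label the model should learn to predict.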

Bruce Swain answered Nov 14 '22 22:11