I have a Time series data for almost 5 years. Using this data I want to forecast next 2 years. How to do this?
I referred many websites regarding this. I noticed that mostly predictions are done only with same set of data used for training they are not forecasting for future such as for next 30 days. If it possible to achieve this via TensorFlow. May I know how to achieve this?
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dropout
dataset_train = pd.read_csv(r'C:\Users\Kavin\source\repos\SampleTensorFlow\SampleTensorFlow\data\traindataset.csv')
training_set = dataset_train.iloc[:, 1:2].values
sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(training_set)
X_train = []
y_train = []
for i in range(60, 2035):
X_train.append(training_set_scaled[i-60:i, 0])
y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
regressor = Sequential()
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))
regressor.add(Dense(units = 1))
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')
regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)
dataset_test = pd.read_csv(r'C:\Users\Kavin\source\repos\SampleTensorFlow\SampleTensorFlow\data\testdataset.csv')
result = dataset_test[['Date','Open']]
real_stock_price = dataset_test.iloc[:, 1:2].values
dataset_total = pd.concat((dataset_train['Open'], dataset_test['Open']), axis = 0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
for i in range(60, 76):
X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
predicted_stock_price = regressor.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
result['PredictedResult'] = pd.Series(predicted_stock_price.ravel(), index=result.index)
result.to_csv(r"C:\Users\Kavin\Downloads\PredictedStocks.csv", index=False)
ax = plt.gca()
result.plot(kind='line', x='Date', y='Open', color='red', label = 'Real Stock Price', ax=ax)
result.plot(kind='line', x='Date', y='PredictedResult', color='blue', label = 'Predicted Stock Price', ax=ax)
plt.show()
for all machine learning problem you want to ask yourself the question "What do i want to predict and what data do i have ?"
In your case you want to predict values at an undefined time in the future, let's call that time T.
We suppose that your current data is labelled ie. for each sample/row (x) you have a corresponding value (y). Let xt be the timestamp of your x data
If you want to predict y at time xt + T then you must feed your algorithm with data such as for each sample x, the corresponding label is y at time xt + T.
This way your algorithm will "learn" to predict the value of y at time xt + T from data at time xt
With Pandas, this can be achieved with shift.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With