How to use a Keras RNN model to forecast for future dates or events?

Here is my code for training the complete model and saving it:

from keras.models import Sequential
from keras.layers import LSTM, Dense

num_units = 2
activation_function = 'sigmoid'
optimizer = 'adam'
loss_function = 'mean_squared_error'
batch_size = 10
num_epochs = 100

# Initialize the RNN
regressor = Sequential()

# Adding the input layer and the LSTM layer
regressor.add(LSTM(units = num_units, activation = activation_function, input_shape=(None, 1)))

# Adding the output layer
regressor.add(Dense(units = 1))

# Compiling the RNN
regressor.compile(optimizer = optimizer, loss = loss_function)

# Using the training set to train the model
regressor.fit(x_train, y_train, batch_size = batch_size, epochs = num_epochs)
regressor.save('model.h5')

After that, I have seen that most of the time people suggest using the test dataset to check the prediction, which I have attempted as well, with good results.

But my problem is in actually using the model I have created. I want a forecast for the next 30 days, or the next minute, whatever the interval. I now have the trained model, but I do not understand what code to write to use the model and forecast prices for the next 30 days or the next minute.

Please suggest a way out. I have been stuck on this problem for a week and have not been able to make any successful attempts.

Here is the link to the repository where one can find the complete runnable code, the model, and the dataset: My repository link

asked Feb 13 '18 by Jaffer Wilson



2 Answers

Well, you need a stateful=True model, so you can feed it one prediction after another to get the next, keeping the model thinking that each input is not a new sequence but a sequel to the previous one.

Fixing the code and training

I see in the code that there is an attempt to make your y be a shifted x (a good option for predicting the next steps). But there is also a big problem in the preprocessing here:

training_set = df_train.values
training_set = min_max_scaler.fit_transform(training_set)

x_train = training_set[0:len(training_set)-1]
y_train = training_set[1:len(training_set)]
x_train = np.reshape(x_train, (len(x_train), 1, 1))

Data for LSTM layers must be shaped as (number_of_sequences, number_of_steps, features).

So, you're clearly creating sequences of 1 step only, meaning that your LSTM is not learning sequences at all. (There is no sequence with only one step).

Assuming that your data is a single unique sequence with 1 feature, it should definitely be shaped as (1, len(x_train), 1).

Naturally, y_train should also have the same shape.

This, in turn, will require that your LSTM layers use return_sequences=True - the only way to make y have a length in steps. Also, to get a good prediction, you may need a more complex model (because now it will be truly learning).

This done, you train your model until you get a satisfactory result.
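As a sketch of this fixed training setup (assuming x_train and y_train are the flat scaled arrays from the question, before its (len, 1, 1) reshape; the layer size is illustrative):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# one single sequence: (n_sequences=1, n_steps=len, n_features=1)
x_train = np.reshape(x_train, (1, len(x_train), 1))
y_train = np.reshape(y_train, (1, len(y_train), 1))

regressor = Sequential()
# return_sequences=True makes the output contain one step per input step
regressor.add(LSTM(32, return_sequences=True, input_shape=(None, 1)))
regressor.add(Dense(1))
regressor.compile(optimizer='adam', loss='mean_squared_error')
regressor.fit(x_train, y_train, epochs=100)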


Predicting the future

For predicting the future, you will need stateful=True LSTM layers.

Before anything else, you reset the model's states: model.reset_states() - necessary every time you're inputting a new sequence into a stateful model.
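A stateful model needs a fixed batch size, so in Keras it is defined with batch_input_shape instead of input_shape. A minimal sketch, assuming one sequence per batch and one feature (the name predictor is hypothetical):

from keras.models import Sequential
from keras.layers import LSTM, Dense

predictor = Sequential()
# batch_input_shape = (batch_size, time_steps, features); None allows any length
predictor.add(LSTM(32, return_sequences=True, stateful=True,
                   batch_input_shape=(1, None, 1)))
predictor.add(Dense(1))
predictor.reset_states()  # reset before feeding each new sequence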

Then you first predict on the entire X_train (the model needs this to understand at which point of the sequence it is; in technical words, to create a state).

predictions = model.predict(X_train)  # this creates states

And finally you create a loop where you start with the last step of the previous prediction:

future = []
currentStep = predictions[:,-1:,:] #last step from the previous prediction

for i in range(future_pred_count):
    currentStep = model.predict(currentStep) #get the next step
    future.append(currentStep) #store the future steps    

#after processing a sequence, reset the states for safety
model.reset_states()
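The collected steps can then be stacked and mapped back to the original scale. A small sketch, assuming min_max_scaler is the fitted scaler from the question:

import numpy as np

future = np.concatenate(future, axis=1)               # (1, future_pred_count, 1)
future = min_max_scaler.inverse_transform(future[0])  # back to the original units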

Example

The notebook linked below does this with a 2-feature sequence and a shifted-future-step prediction, using a method a little different from this answer but based on the same principle.

I created two models: one with stateful=False, for training without needing to reset states every time (never forget to reset states when you're starting a new sequence), and the other with stateful=True, copying the weights from the trained model, for predicting the future.
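Copying the weights across is a single call, assuming the two models have an identical layer structure (the names regressor and predictor are the hypothetical ones from the sketches above):

predictor.set_weights(regressor.get_weights())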

https://github.com/danmoller/TestRepo/blob/master/TestBookLSTM.ipynb

answered Oct 19 '22 by Daniel Möller


What you need to do in order to predict future values with RNNs is to provide data as sequences. Something like this:

[0 1 2] --> [3]
[1 2 3] --> [4]
[2 3 4] --> [5]
[3 4 5] --> [6]
[4 5 6] --> [7]

RNNs learn the structure of sequences, and therefore need a specific three-dimensional input shape:

(n_samples, time_steps, n_features)

For instance, the time steps could be 7 if you use every day of the last week.
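Concretely, 100 windows of 7 daily steps of a single variable would be shaped like this (a minimal illustration with random data):

import numpy as np

windows = np.random.rand(100, 7, 1)
print(windows.shape)  # (100, 7, 1) == (n_samples, time_steps, n_features)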

How can I create a dataset for RNNs?

  1. tf.keras.preprocessing.timeseries_dataset_from_array

What you'll need to do is provide this function with a) present values, and b) future values. Here, seq_length is the number of time steps to use.

import tensorflow as tf

seq_length = 3

x = tf.range(25)[:-seq_length]

y = tf.range(25)[seq_length:]

ds = tf.keras.preprocessing.timeseries_dataset_from_array(x, y,
                                                          sequence_length=seq_length,
                                                          batch_size=1)

for present_values, next_value in ds.take(5):
    print(tf.squeeze(present_values).numpy(), '-->', next_value.numpy())
[0 1 2] --> [3]
[1 2 3] --> [4]
[2 3 4] --> [5]
[3 4 5] --> [6]
[4 5 6] --> [7]

You can also do the same for multiple variables:

import tensorflow as tf

seq_length = 3

x = tf.concat([
    tf.reshape(tf.range(25, dtype=tf.float32)[:-seq_length], (-1, 1)),
    tf.reshape(tf.linspace(0., .24, 25)      [:-seq_length], (-1, 1))], axis=-1)

y = tf.concat([
    tf.reshape(tf.range(25, dtype=tf.float32)[seq_length:], (-1, 1)),
    tf.reshape(tf.linspace(0., .24, 25)      [seq_length:], (-1, 1))], axis=-1)

ds = tf.keras.preprocessing.timeseries_dataset_from_array(x, y,
                                                          sequence_length=seq_length,
                                                          batch_size=1)

for present_values, next_value in ds.take(5):
    print(tf.squeeze(present_values).numpy(), '-->', tf.squeeze(next_value).numpy())
    
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(2)
])

model.compile(loss='mae', optimizer='adam')

history = model.fit(ds)
[[0.   0.  ]
 [1.   0.01]
 [2.   0.02]] --> [3.   0.03]
[[1.   0.01]
 [2.   0.02]
 [3.   0.03]] --> [4.   0.04]
[[2.   0.02]
 [3.   0.03]
 [4.   0.04]] --> [5.   0.05]
[[3.   0.03]
 [4.   0.04]
 [5.   0.05]] --> [6.   0.06]
[[4.   0.04]
 [5.   0.05]
 [6.   0.06]] --> [7.   0.07]
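Once trained, forecasting just means feeding the model the most recent window. A sketch, assuming the x and model defined above:

last_window = tf.reshape(x[-seq_length:], (1, seq_length, 2))  # (batch, time_steps, features)
next_step = model.predict(last_window)                         # one predicted [value, fraction] pair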
  2. A custom function

import tensorflow as tf
import numpy as np

x = np.arange(25)

def univariate_data(dataset, start_index, end_index, history_size, target_size):
    data, labels = [], []

    start_index = start_index + history_size
    if end_index is None:
        end_index = len(dataset) - target_size

    for i in range(start_index, end_index):
        indices = np.arange(i-history_size, i)
        data.append(np.reshape(dataset[indices], (history_size, 1)))
        labels.append(dataset[i:i+target_size])
    return np.array(data), np.array(labels)

present_values, future_values = univariate_data(x, 0, 9, 3, 2)  # history_size=3, target_size=2

for present, next_val in zip(present_values, future_values):
    print(tf.squeeze(present).numpy(), '-->', tf.squeeze(next_val).numpy())
[0 1 2] --> [3 4]
[1 2 3] --> [4 5]
[2 3 4] --> [5 6]
[3 4 5] --> [6 7]
[4 5 6] --> [7 8]
[5 6 7] --> [8 9]

And now for multiple variables:

import tensorflow as tf
import numpy as np

history_size = 3

x = np.concatenate([np.expand_dims(np.arange(25), 1)[:-history_size],
                    np.expand_dims(np.linspace(0., .24, 25), 1)[:-history_size]], axis=1)

y = np.concatenate([np.expand_dims(np.arange(25), 1)[history_size:],
                    np.expand_dims(np.linspace(0., .24, 25), 1)[history_size:]], axis=1)


def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size, step, single_step=False):
    data, labels = [], []
    start_index = start_index + history_size
    if end_index is None:
        end_index = len(dataset) - target_size
    for i in range(start_index, end_index):
        indices = range(i - history_size, i, step)
        data.append(dataset[indices])
        if single_step:
            labels.append(target[i + target_size])
        else:
            labels.append(target[i:i + target_size])

    return np.array(data), np.array(labels)

present_values, future_values = multivariate_data(x, y, 0, 8, history_size, 1, 1)

for present, next_val in zip(present_values, future_values):
    print(tf.squeeze(present).numpy(), '-->', tf.squeeze(next_val).numpy())
[[0.   0.  ]
 [1.   0.01]
 [2.   0.02]] --> [6.   0.06]
[[1.   0.01]
 [2.   0.02]
 [3.   0.03]] --> [7.   0.07]
[[2.   0.02]
 [3.   0.03]
 [4.   0.04]] --> [8.   0.08]
[[3.   0.03]
 [4.   0.04]
 [5.   0.05]] --> [9.   0.09]
[[4.   0.04]
 [5.   0.05]
 [6.   0.06]] --> [10.   0.1]
  3. tf.data.Dataset.window

import tensorflow as tf
import numpy as np

history_size = 3
lookahead = 2

x = tf.range(8)

ds = tf.data.Dataset.from_tensor_slices(x)
ds = ds.window(history_size + lookahead, shift=1, drop_remainder=True)
ds = ds.flat_map(lambda window: window.batch(history_size + lookahead))
ds = ds.map(lambda window: (window[:-lookahead], window[-lookahead:]))

for present_values, next_value in ds:
    print(present_values.numpy(), '-->', next_value.numpy())
[0 1 2] --> [3 4]
[1 2 3] --> [4 5]
[2 3 4] --> [5 6]
[3 4 5] --> [6 7]

With multiple variables:

import tensorflow as tf
import numpy as np

history_size = 3
lookahead = 2

x = tf.concat([
    tf.reshape(tf.range(20, dtype=tf.float32), (-1, 1)),
    tf.reshape(tf.linspace(0., .19, 20), (-1, 1))], axis=-1)

ds = tf.data.Dataset.from_tensor_slices(x)
ds = ds.window(history_size + lookahead, shift=1, drop_remainder=True)
ds = ds.flat_map(lambda window: window.batch(history_size + lookahead))
ds = ds.map(lambda window: (window[:-lookahead], window[-lookahead:]))

for present_values, next_value in ds.take(8):
    print(tf.squeeze(np.round(present_values, 2)).numpy(), '-->',
          tf.squeeze(np.round(next_value, 2)).numpy())
    print()
[[0.   0.  ]
 [1.   0.01]
 [2.   0.02]] --> [[3.   0.03]
                   [4.   0.04]]
[[1.   0.01]
 [2.   0.02]
 [3.   0.03]] --> [[4.   0.04]
                   [5.   0.05]]
[[2.   0.02]
 [3.   0.03]
 [4.   0.04]] --> [[5.   0.05]
                   [6.   0.06]]
[[3.   0.03]
 [4.   0.04]
 [5.   0.05]] --> [[6.   0.06]
                   [7.   0.07]]
[[4.   0.04]
 [5.   0.05]
 [6.   0.06]] --> [[7.   0.07]
                   [8.   0.08]]
[[5.   0.05]
 [6.   0.06]
 [7.   0.07]] --> [[8.   0.08]
                   [9.   0.09]]
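To actually train on this windowed dataset, batch it and give the model an output that matches the (lookahead, features) targets. A sketch with illustrative layer sizes:

train_ds = ds.batch(4)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(16),                 # reads (batch, 3, 2) windows
    tf.keras.layers.Dense(lookahead * 2),     # 2 steps x 2 features
    tf.keras.layers.Reshape((lookahead, 2))   # match the (2, 2) targets
])
model.compile(loss='mae', optimizer='adam')
model.fit(train_ds, epochs=5)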
answered Oct 19 '22 by Nicolas Gervais