I have a dataset X consisting of N = 4000 samples, where each sample has d = 2 features (continuous values) spanning back t = 10 time steps. I also have the corresponding 'labels' for each sample, which are also continuous values, at time step 11.
At the moment my dataset is in the shape X: [4000, 20], Y: [4000].
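For concreteness, this is roughly what the data looks like (the values below are random placeholders, and I am assuming the 20 columns are ordered as 10 time steps x 2 features):

    import numpy as np

    # Placeholder data with the shapes described above (values are random,
    # just to illustrate the layout I currently have)
    X = np.random.rand(4000, 20).astype(np.float32)  # 4000 samples, 10 steps * 2 features flattened
    Y = np.random.rand(4000).astype(np.float32)      # one continuous target per sample (time step 11)

    # Assuming the 20 columns are ordered timestep-major,
    # i.e. [t1_f1, t1_f2, t2_f1, t2_f2, ..., t10_f1, t10_f2],
    # each sample's time series could be recovered with a reshape:
    X_seq = X.reshape(4000, 10, 2)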
I want to train an LSTM in TensorFlow to predict the value of Y (regression), given the 10 previous time steps of d features, but I am having a tough time implementing this.
The main problem I have at the moment is understanding how TensorFlow expects the input to be formatted. I have seen various examples such as this, but those examples deal with one long, continuous time series. My data consists of separate samples, each an independent time series.
The input to an LSTM is always a 3D array.
Here we have a simple LSTM layer that feeds into a dense output layer. The model takes input with three dimensions: batch size, time steps, and features. As with all Keras layers, the batch size does not need to be specified, but the other two dimensions do.
There are three different gates in an LSTM cell: a forget gate, an input gate, and an output gate.
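For example, such a model could be sketched in Keras as follows (the 32 units, optimizer, and loss are assumptions chosen for illustration, not requirements):

    import tensorflow as tf

    # Minimal sketch: an LSTM over 10 time steps of 2 features,
    # feeding a single dense unit for the regression output.
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(32, input_shape=(10, 2)),  # batch size is omitted, as noted above
        tf.keras.layers.Dense(1)                        # one continuous prediction per sequence
    ])
    model.compile(optimizer='adam', loss='mse')

    # With X_seq of shape [num_samples, 10, 2] and Y of shape [num_samples]:
    # model.fit(X_seq, Y, batch_size=32, epochs=10)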
The documentation of tf.nn.dynamic_rnn states:

    inputs: The RNN inputs. If time_major == False (default), this must be a Tensor of shape: [batch_size, max_time, ...], or a nested tuple of such elements.
In your case, this means that the input should have a shape of [batch_size, 10, 2]. Instead of training on all 4000 sequences at once, you'd use only batch_size many of them in each training iteration. Something like the following should work (reshape added for clarity):
    import tensorflow as tf

    batch_size = 32

    # batch_size sequences of length 10 with 2 values for each timestep
    input = get_batch(X, batch_size).reshape([batch_size, 10, 2])

    # Create LSTM cell with state size 256. Could also use GRUCell, ...
    # Note: state_is_tuple=False is deprecated;
    # the option might be completely removed in the future
    cell = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)

    outputs, state = tf.nn.dynamic_rnn(cell, input,
                                       sequence_length=[10] * batch_size,
                                       dtype=tf.float32)
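Note that get_batch is not defined in the snippet above; one hypothetical implementation that draws a random mini-batch with numpy might look like this:

    import numpy as np

    def get_batch(data, batch_size=32, indices=None):
        # Hypothetical helper, not part of the original answer:
        # draw a random mini-batch of rows from `data`.
        # In practice you would reuse the same `indices` for X and Y
        # so that inputs and labels stay aligned within a batch.
        if indices is None:
            indices = np.random.choice(len(data), size=batch_size, replace=False)
        return data[indices]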
From the documentation, outputs will be of shape [batch_size, 10, 256], i.e. one 256-dimensional output for each timestep. state will be a tuple (c, h) with each element of shape [batch_size, 256]. You could predict your final value, one for each sequence, from that:
    predictions = tf.contrib.layers.fully_connected(state.h,
                                                    num_outputs=1,
                                                    activation_fn=None)
    loss = get_loss(get_batch(Y).reshape([batch_size, 1]), predictions)
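get_loss is likewise left undefined above; for a regression target, a mean squared error loss with a standard optimizer would be a typical choice (the placeholder, the Adam optimizer, and the learning rate here are assumptions, not part of the original answer):

    # Labels for the current batch are fed through a placeholder
    targets = tf.placeholder(tf.float32, [batch_size, 1])

    # Mean squared error between labels and predictions
    loss = tf.losses.mean_squared_error(labels=targets, predictions=predictions)

    # One gradient step per training iteration
    train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)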
The number 256 in the shapes of outputs and state is determined by cell.output_size and cell.state_size, respectively. When creating the LSTMCell as above, these are the same. Also see the LSTMCell documentation.
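If you want to check those sizes directly, you can inspect the cell's properties (assuming the same 256-unit cell as above):

    cell = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)
    print(cell.output_size)  # 256
    print(cell.state_size)   # LSTMStateTuple(c=256, h=256)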