 

Understanding TensorFlow LSTM input shape


I have a dataset X which consists of N = 4000 samples; each sample consists of d = 2 features (continuous values) spanning back t = 10 time steps. I also have the corresponding 'label' for each sample, which is also a continuous value, at time step 11.

At the moment my dataset is in the shape X: [4000,20], Y: [4000].

I want to train an LSTM using TensorFlow to predict the value of Y (regression), given the 10 previous inputs of d features, but I am having a tough time implementing this in TensorFlow.

The main problem I have at the moment is understanding how TensorFlow expects the input to be formatted. I have seen various examples, such as this one, but they deal with one long string of continuous time series data. My data consists of different samples, each an independent time series.
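For concreteness, assuming each flattened row of X is ordered with the two features of each time step adjacent (t1_f1, t1_f2, t2_f1, t2_f2, ...), the arrays could be reshaped like this (a sketch with synthetic data standing in for the real dataset; the actual column order depends on how the data was flattened):

    import numpy as np

    # Synthetic stand-ins with the shapes described above
    X = np.random.rand(4000, 20).astype(np.float32)  # [N, t * d]
    Y = np.random.rand(4000).astype(np.float32)      # [N]

    # Assumes row layout [t1_f1, t1_f2, t2_f1, t2_f2, ...];
    # reshape to the [samples, time_steps, features] layout RNNs expect
    X_seq = X.reshape(4000, 10, 2)
    print(X_seq.shape)  # (4000, 10, 2)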

asked Sep 05 '16 by Renier Botha

People also ask

What is the input shape of LSTM?

The input of an LSTM is always a 3D array.

What should be the dimension of the input to an LSTM layer?

We have a simple LSTM model (4 gates) here, which feeds into a dense output layer. The model takes an input of three dimensions: batch size, time steps, and features. As with all Keras layers, the batch size is not a mandatory argument, but the other two need to be given (see the sketch after this section).

What are the inputs of an LSTM cell?

There are three different gates in an LSTM cell: a forget gate, an input gate, and an output gate.
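To make the three-dimensional input concrete, here is a minimal Keras sketch (an illustration only, assuming a TensorFlow version that ships tf.keras; the layer sizes are arbitrary):

    import tensorflow as tf

    model = tf.keras.Sequential([
        # input_shape omits the batch dimension: (time_steps, features)
        tf.keras.layers.LSTM(32, input_shape=(10, 2)),
        tf.keras.layers.Dense(1)  # one continuous output for regression
    ])
    model.compile(optimizer='adam', loss='mse')
    model.summary()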


1 Answer

The documentation of tf.nn.dynamic_rnn states:

inputs: The RNN inputs. If time_major == False (default), this must be a Tensor of shape: [batch_size, max_time, ...], or a nested tuple of such elements.

In your case, this means that the input should have a shape of [batch_size, 10, 2]. Instead of training on all 4000 sequences at once, you'd use only batch_size of them in each training iteration. Something like the following should work (reshape added for clarity):

    batch_size = 32
    # batch_size sequences of length 10 with 2 values for each timestep
    input = get_batch(X, batch_size).reshape([batch_size, 10, 2])
    # Create LSTM cell with state size 256. Could also use GRUCell, ...
    # Note: state_is_tuple=False is deprecated;
    # the option might be completely removed in the future
    cell = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)
    outputs, state = tf.nn.dynamic_rnn(cell,
                                       input,
                                       sequence_length=[10] * batch_size,
                                       dtype=tf.float32)
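Note that get_batch is not defined in the answer. One minimal way to implement it is sketched below; it returns paired X and Y slices sampled with the same indices so labels stay aligned with their sequences (the answer's snippets call it with a slightly different signature, but the idea is the same):

    import numpy as np

    def get_batch(X, Y, batch_size):
        """Hypothetical helper: sample a random batch of sequences and labels."""
        idx = np.random.choice(len(X), batch_size, replace=False)
        # [batch_size, 20] -> [batch_size, 10, 2]; labels -> [batch_size, 1]
        return X[idx].reshape(batch_size, 10, 2), Y[idx].reshape(batch_size, 1)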

From the documentation, outputs will be of shape [batch_size, 10, 256], i.e. one 256-dimensional output for each time step. state will be a tuple (c, h) of tensors, each of shape [batch_size, 256]. You could predict your final value, one for each sequence, from that:

    predictions = tf.contrib.layers.fully_connected(state.h,
                                                    num_outputs=1,
                                                    activation_fn=None)
    loss = get_loss(get_batch(Y).reshape([batch_size, 1]), predictions)
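get_loss is likewise left undefined. For a regression target, mean squared error is a common choice; a sketch using the TF 1.x API, where targets is a hypothetical placeholder holding the batch labels:

    # targets: hypothetical [batch_size, 1] placeholder for the labels
    targets = tf.placeholder(tf.float32, [batch_size, 1])
    loss = tf.losses.mean_squared_error(labels=targets, predictions=predictions)
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)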

The number 256 in the shapes of outputs and state is determined by cell.output_size and cell.state_size, respectively. When creating the LSTMCell as above, these are the same. See also the LSTMCell documentation.
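For illustration, under TF 1.x these attributes can be inspected directly (a sketch):

    cell = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)
    print(cell.output_size)  # 256
    print(cell.state_size)   # LSTMStateTuple(c=256, h=256)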

answered Sep 21 '22 by fwalch