I would like to build a toy LSTM model for regression. This nice tutorial is already too complicated for a beginner.
Given a sequence of length time_steps, predict the next value. Consider time_steps=3 and the sequences:
array([
[[ 1.],
[ 2.],
[ 3.]],
[[ 2.],
[ 3.],
[ 4.]],
...
the target values should be:
array([ 4., 5., ...
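For reference, here is a minimal sketch of how inputs and targets of this shape could be built from a single series. The helper name make_windows and the use of NumPy are assumptions for illustration; they are not part of the original code.

```python
import numpy as np

def make_windows(series, time_steps):
    """Slice a 1-D series into overlapping windows and next-value targets."""
    series = np.asarray(series, dtype=np.float32)
    n_samples = len(series) - time_steps
    # Row i holds series[i : i + time_steps]
    X = np.stack([series[i:i + time_steps] for i in range(n_samples)])
    # Add a trailing feature axis: (n_samples, time_steps, 1)
    X = X[:, :, np.newaxis]
    # The target is the value that follows each window
    y = series[time_steps:]
    return X, y

X, y = make_windows(np.arange(1, 8), time_steps=3)
print(X.shape)  # (4, 3, 1)
print(y)        # [4. 5. 6. 7.]
```

To feed a target placeholder of shape [None, 1], y would additionally need y.reshape(-1, 1).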
I define the following model:
import tensorflow as tf

# Network Parameters
time_steps = 3
num_neurons = 64  # arbitrary
n_features = 1
# tf Graph input
x = tf.placeholder("float", [None, time_steps, n_features])
y = tf.placeholder("float", [None, 1])
# Define weights
weights = {
    'out': tf.Variable(tf.random_normal([num_neurons, 1]))
}
biases = {
    'out': tf.Variable(tf.random_normal([1]))
}
# LSTM model
def lstm_model(X, weights, biases, learning_rate=0.01, optimizer='Adagrad'):
    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, time_steps, n_features)
    # Required shape: 'time_steps' tensors list of shape (batch_size, n_features)

    # Permuting batch_size and time_steps
    # input dimension: Tensor("Placeholder_:0", shape=(?, 3, 1), dtype=float32)
    X = tf.transpose(X, [1, 0, 2])
    # transposed dimension: Tensor("transpose_41:0", shape=(3, ?, 1), dtype=float32)

    # Reshaping to (time_steps*batch_size, n_features)
    X = tf.reshape(X, [-1, n_features])
    # reshaped dimension: Tensor("Reshape_:0", shape=(?, 1), dtype=float32)

    # Split to get a list of 'time_steps' tensors of shape (batch_size, n_features)
    X = tf.split(0, time_steps, X)
    # split dimension: [<tf.Tensor 'split_:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'split_:1' shape=(?, 1) dtype=float32>, <tf.Tensor 'split_:2' shape=(?, 1) dtype=float32>]

    # LSTM cell
    cell = tf.nn.rnn_cell.LSTMCell(num_neurons)  # Or GRUCell(num_neurons)
    output, state = tf.nn.dynamic_rnn(cell=cell, inputs=X, dtype=tf.float32)
    output = tf.transpose(output, [1, 0, 2])
    last = tf.gather(output, int(output.get_shape()[0]) - 1)
    return tf.matmul(last, weights['out']) + biases['out']
When I instantiate the LSTM model with pred = lstm_model(x, weights, biases), I get the following:
---> output, state = tf.nn.dynamic_rnn(cell=cell, inputs=X, dtype=tf.float32)
ValueError: Dimension must be 2 but is 3 for 'transpose_42' (op: 'Transpose') with input shapes: [?,1], [3]
1) Do you know what the problem is?
2) Will multiplying the LSTM output by the weights yield the regression?
As discussed in the comments, the tf.nn.dynamic_rnn(cell, inputs, ...) function expects a list of three-dimensional tensors* as its inputs argument, where the dimensions are interpreted by default as batch_size x num_timesteps x num_features. (If you pass time_major=True, they are interpreted as num_timesteps x batch_size x num_features.) Therefore the preprocessing you've applied to the original placeholder is unnecessary, and you can pass the original X value directly to tf.nn.dynamic_rnn().
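To make the two layouts concrete, here is a small NumPy sketch (an illustration only, not TensorFlow code; the variable names are made up for this example). It shows the batch-major array that matches the placeholder, its time-major counterpart, and the list of two-dimensional pieces that the transpose/reshape/split preprocessing produces instead.

```python
import numpy as np

batch_size, time_steps, n_features = 2, 3, 1

# Batch-major layout (batch_size x num_timesteps x num_features),
# i.e. the same layout as the placeholder x in the question
batch_major = np.array([[[1.], [2.], [3.]],
                        [[2.], [3.], [4.]]], dtype=np.float32)
print(batch_major.shape)  # (2, 3, 1)

# Time-major layout (num_timesteps x batch_size x num_features)
time_major = np.transpose(batch_major, (1, 0, 2))
print(time_major.shape)  # (3, 2, 1)

# The preprocessing from the question, mirrored in NumPy: the result is a
# list of time_steps arrays that each have only two dimensions
flat = time_major.reshape(-1, n_features)
pieces = np.split(flat, time_steps, axis=0)
print([p.shape for p in pieces])  # [(2, 1), (2, 1), (2, 1)]
```

Each element of pieces is two-dimensional, whereas the untouched batch_major array already has the three-dimensional layout described above.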
* Technically it can accept complicated nested structures in addition to lists, but the leaf elements must be three-dimensional tensors.**
** Investigating this turned up a bug in the implementation of tf.nn.dynamic_rnn(). In principle, it should be sufficient for the inputs to have at least two dimensions, but the time_major=False path assumes that they have exactly three dimensions when it transposes the input into the time-major form, and it was the error message that this bug inadvertently causes that showed up in your program. We're working on getting that fixed.
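The failure mode can be imitated in NumPy, as a rough analogue only: this is not the TensorFlow implementation, and the exception text differs from the one in the question. Applying a three-axis permutation to a two-dimensional array fails in the same spirit as the 'Transpose' op error with input shapes [?,1] and [3].

```python
import numpy as np

# One element of the split list from the question: (batch_size, n_features)
step = np.zeros((2, 1), dtype=np.float32)

try:
    # Batch-major -> time-major permutation; only valid for 3-D input
    np.transpose(step, (1, 0, 2))
    failed = False
except ValueError as err:
    failed = True
    print("transpose failed:", err)

# The same permutation succeeds once the input has three dimensions
full = np.zeros((2, 3, 1), dtype=np.float32)
print(np.transpose(full, (1, 0, 2)).shape)  # (3, 2, 1)
```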