
TensorFlow dynamic_rnn regressor: ValueError dimension mismatch

I would like to build a toy LSTM model for regression. This nice tutorial is already too complicated for a beginner.

Given a sequence of length time_steps, predict the next value. Consider time_steps=3 and the sequences:

array([
   [[  1.],
    [  2.],
    [  3.]],

   [[  2.],
    [  3.],
    [  4.]],
    ...

the target values should be:

array([  4.,   5., ...
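A minimal NumPy sketch of how such windows and targets could be built from a single series. The helper name make_windows is hypothetical (it is not part of the question), and the targets are returned with shape (n_samples, 1) on the assumption that they should match a [None, 1] target placeholder:

```python
import numpy as np

def make_windows(series, time_steps):
    """Split a 1-D series into overlapping input windows and next-value targets."""
    series = np.asarray(series, dtype=np.float32)
    n_samples = len(series) - time_steps
    # Each row holds `time_steps` consecutive values; the target is the value that follows.
    X = np.stack([series[i:i + time_steps] for i in range(n_samples)])
    y = series[time_steps:]
    # Add a trailing axis: inputs become (n_samples, time_steps, 1), targets (n_samples, 1).
    return X[..., np.newaxis], y[:, np.newaxis]

X, y = make_windows(np.arange(1, 8), time_steps=3)
print(X.shape, y.shape)     # (4, 3, 1) (4, 1)
print(X[0].ravel(), y[0])   # first window 1, 2, 3 and its target 4
```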

I define the following model:

import tensorflow as tf

# Network Parameters
time_steps = 3
num_neurons = 64  # arbitrary
n_features = 1

# tf Graph input
x = tf.placeholder("float", [None, time_steps, n_features])
y = tf.placeholder("float", [None, 1])

# Define weights
weights = {
   # Maps the LSTM output (num_neurons values) to a single regression value
   'out': tf.Variable(tf.random_normal([num_neurons, 1]))
}
biases = {
   'out': tf.Variable(tf.random_normal([1]))
}

#LSTM model
def lstm_model(X, weights, biases, learning_rate=0.01, optimizer='Adagrad'):

  # Prepare data shape to match `rnn` function requirements
  # Current data input shape: (batch_size, time_steps, n_features)
  # Required shape: 'time_steps' tensors list of shape (batch_size, n_features)
  # Permuting batch_size and time_steps
  # input dimension: Tensor("Placeholder_:0", shape=(?, 3, 1), dtype=float32)

  X = tf.transpose(X, [1, 0, 2])
  # transposed dimension: Tensor("transpose_41:0", shape=(3, ?, 1), dtype=float32)

  # Reshaping to (time_steps*batch_size, n_features)
  X = tf.reshape(X, [-1, n_features])
  # reshaped dimension: Tensor("Reshape_:0", shape=(?, 1), dtype=float32)

  # Split to get a list of 'time_steps' tensors of shape (batch_size, n_features)
  X = tf.split(0, time_steps, X)
  # split dimension: [<tf.Tensor 'split_:0' shape=(?, 1) dtype=float32>,
  #                   <tf.Tensor 'split_:1' shape=(?, 1) dtype=float32>,
  #                   <tf.Tensor 'split_:2' shape=(?, 1) dtype=float32>]

  # LSTM cell
  cell = tf.nn.rnn_cell.LSTMCell(num_neurons) #Or GRUCell(num_neurons)

  output, state = tf.nn.dynamic_rnn(cell=cell, inputs=X, dtype=tf.float32)

  output = tf.transpose(output, [1, 0, 2])
  last = tf.gather(output, int(output.get_shape()[0]) - 1)


  return tf.matmul(last, weights['out']) + biases['out']

When instantiating the LSTM model with pred = lstm_model(x, weights, biases), I get the following error:

---> output, state = tf.nn.dynamic_rnn(cell=cell, inputs=X, dtype=tf.float32)
ValueError: Dimension must be 2 but is 3 for 'transpose_42' (op: 'Transpose') with input shapes: [?,1], [3]

1) Do you know what the problem is?

2) Will multiplying the LSTM output by the weights yield the regression?

asked Feb 28 '17 16:02 by mastro


1 Answer

As discussed in the comments, the tf.nn.dynamic_rnn(cell, inputs, ...) function expects a three-dimensional tensor (or a list of such tensors)* as its inputs argument, where the dimensions are interpreted by default as batch_size x num_timesteps x num_features. (If you pass time_major=True, they are interpreted as num_timesteps x batch_size x num_features.) Therefore the preprocessing you've applied to the original placeholder is unnecessary, and you can pass the original X value directly to tf.nn.dynamic_rnn().
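To make the two layouts and the final regression step concrete, here is a shape-only sketch in NumPy. It does not run an LSTM: the outputs array is a random stand-in for what dynamic_rnn would return in the default batch-major layout, and w_out and b_out are illustrative names playing the role of weights['out'] and biases['out'] from the question.

```python
import numpy as np

batch_size, time_steps, n_features, num_neurons = 2, 3, 1, 64

# Batch-major input, the default layout (time_major=False): (batch, time, features).
x_batch_major = np.arange(batch_size * time_steps * n_features, dtype=np.float32)
x_batch_major = x_batch_major.reshape(batch_size, time_steps, n_features)

# Time-major layout (time_major=True): same data with the first two axes swapped.
x_time_major = np.transpose(x_batch_major, (1, 0, 2))

# Stand-in for the RNN outputs (NOT a real LSTM): one num_neurons vector per time step.
rng = np.random.default_rng(0)
outputs = rng.standard_normal((batch_size, time_steps, num_neurons))

# Regression readout: keep only the last time step, then apply a linear layer.
last = outputs[:, -1, :]                       # (batch, num_neurons)
w_out = rng.standard_normal((num_neurons, 1))  # role of weights['out']
b_out = rng.standard_normal(1)                 # role of biases['out']
pred = last @ w_out + b_out                    # (batch, 1), same shape as the targets

print(x_batch_major.shape, x_time_major.shape, pred.shape)
```

Under these assumptions, multiplying the last output by the weights and adding the bias gives one value per sequence, which is the shape a regression target of [None, 1] needs.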


* Technically it can accept complicated nested structures in addition to lists, but the leaf elements must be three-dimensional tensors.**

** Investigating this turned up a bug in the implementation of tf.nn.dynamic_rnn(). In principle, it should be sufficient for the inputs to have at least two dimensions, but the time_major=False path assumes that they have exactly three dimensions when it transposes the input into the time-major form, and the error this bug inadvertently raises is the one that showed up in your program. We're working on getting that fixed.
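The same kind of failure can be reproduced outside TensorFlow. The NumPy sketch below is an analogy rather than the actual dynamic_rnn code path: it applies a three-axis permutation, like the batch/time swap described above, to a three-dimensional array and then to a two-dimensional one shaped like a single element of the list that tf.split produced in the question.

```python
import numpy as np

perm = (1, 0, 2)  # swap the batch and time axes, keep the feature axis

x_3d = np.zeros((4, 3, 1))  # (batch, time, features): what the transpose assumes
x_2d = np.zeros((4, 1))     # (batch, features): one element of the split list

# A rank-3 input transposes cleanly into time-major form.
print(np.transpose(x_3d, perm).shape)  # (3, 4, 1)

# A rank-2 input cannot take a three-axis permutation, so the transpose raises.
try:
    np.transpose(x_2d, perm)
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)
```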

answered Oct 12 '22 23:10 by mrry