
Why can't tensorflow determine the shape of this expression?

Tags:

tensorflow

The following expression is giving me problems. I have defined batch_size as batch_size = tf.shape(input_tensor)[0], which dynamically determines the batch size from the input tensor to the model. I have used it elsewhere in the code without issue. What confuses me is that when I run the following line, it reports the shape as (?, ?). I would expect (?, 128), because the second dimension is known.

print(tf.zeros((batch_size, 128)).get_shape())

I want to know the shape since I am trying to do the following and I am getting an error.

    rnn_input = tf.reduce_sum(w * decoder_input, 1)
    last_out = decoder_outputs[t - 1] if t else tf.zeros((batch_size, 128))
    rnn_input = tf.concat(1, (rnn_input, last_out))

This code needs to set last_out to zero on the first time step.

Here is the error: ValueError: Linear expects shape[1] of arguments: [[None, None], [None, 1024]]

I am doing something similar when I determine my initial state vector for the RNNs.

state = tf.zeros((batch_size, decoder_multi_rnn.state_size), tf.float32)

I also get (?, ?) when I print the shape of state, but it does not throw any exceptions when I use it.

chasep255 asked Jun 17 '16 12:06


2 Answers

You are mixing static shapes and dynamic shapes. The static shape is what you get from tensor.get_shape(), which is a best-effort attempt to obtain the shape when the graph is built, while the dynamic shape comes from sess.run(tf.shape(tensor)) and is always defined.

To be more precise, tf.shape(tensor) creates an op in the graph that will produce a shape tensor on the run call. If you do aop = tf.shape(tensor)[0], there's some magic through _SliceHelper that adds extra ops to extract the first element of the shape tensor on the run call.

This means that myval = tf.zeros((aop, 128)) has to run aop to obtain the dimensions, so the first dimension of myval is undefined until you issue the run call. For example, your run call could look like sess.run(myval, feed_dict={aop: 2}), where feed_dict overrides aop with 2. Hence static shape inference reports ? for that dimension.
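The distinction can be mimicked without TensorFlow. The following is a pure-Python toy model, not TensorFlow's implementation: Deferred, static_shape, and dynamic_shape are hypothetical names invented for illustration. A dimension whose value is only available at run time is reported as None (which TensorFlow prints as ?), while the run-time shape is always fully defined.

```python
class Deferred:
    """Stands in for a value that is only known when the graph is run."""

    def __init__(self, name):
        self.name = name


def static_shape(dims):
    """Best-effort shape before running: None marks an unknown dimension."""
    return tuple(d if isinstance(d, int) else None for d in dims)


def dynamic_shape(dims, feed):
    """Shape at run time: every Deferred dimension is looked up in `feed`."""
    return tuple(d if isinstance(d, int) else feed[d.name] for d in dims)


batch_size = Deferred("batch_size")  # plays the role of tf.shape(x)[0]
dims = (batch_size, 128)

print(static_shape(dims))                      # (None, 128)
print(dynamic_shape(dims, {"batch_size": 2}))  # (2, 128)
```

Note that this toy keeps the known 128, which is the result the question expected; the TensorFlow version used in the question reported ? for both dimensions.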

Yaroslav Bulatov answered Sep 22 '22 09:09


(EDIT: I rewrote my answer, as what I wrote before was not to the point.)

The quick fix to your issue is to use set_shape() to update the static (inferred) shape of the Tensor:

import tensorflow as tf

input_tensor = tf.placeholder(tf.float32, [None, 32])
batch_size = tf.shape(input_tensor)[0]

res = tf.zeros((batch_size, 128))
print(res.get_shape())  # prints (?, ?), whereas one would expect (?, 128)

res.set_shape([None, 128])
print(res.get_shape())  # prints (?, 128)
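What set_shape() does to the static shape can be pictured as a merge of two partial shapes. The sketch below is a pure-Python toy under that assumption, with a hypothetical helper merge_shapes; it is not TensorFlow code. An unknown dimension (None) takes its value from the other shape, and two known dimensions must agree.

```python
def merge_shapes(current, hint):
    """Combine two partial shapes; None means 'unknown' in either one."""
    if len(current) != len(hint):
        raise ValueError("ranks differ: %r vs %r" % (current, hint))
    merged = []
    for a, b in zip(current, hint):
        if a is not None and b is not None and a != b:
            raise ValueError("incompatible dimensions: %r vs %r" % (a, b))
        # Keep whichever side is known; stays None if both are unknown.
        merged.append(b if a is None else a)
    return tuple(merged)


print(merge_shapes((None, None), (None, 128)))  # (None, 128)
print(merge_shapes((None, 128), (None, 128)))   # (None, 128)
```

In this picture the hint only adds information: the batch dimension stays unknown, and a hint that contradicts an already known dimension is rejected rather than silently applied.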

As for why TensorFlow loses the information about the second dimension being 128, I don't really know.

Maybe @Yaroslav will be able to answer.

EDIT: The incorrect behavior was corrected following this issue.

Olivier Moindrot answered Sep 22 '22 09:09