I have the following expression which is giving me problems. I have defined the batch size as batch_size = tf.shape(input_tensor)[0], which dynamically determines the batch size from the input tensor to the model. I have used it elsewhere in the code without issue. What confuses me is that when I run the following line of code it says the shape is (?, ?), whereas I would expect it to be (?, 128) because the second dimension is known.
print(tf.zeros((batch_size, 128)).get_shape())
I want to know the shape because I am trying to do the following and I am getting an error:
rnn_input = tf.reduce_sum(w * decoder_input, 1)
last_out = decoder_outputs[t - 1] if t else tf.zeros((batch_size, 128))
rnn_input = tf.concat(1, (rnn_input, last_out))
This code needs to set last_out to zero on the first time step.
Here is the error: ValueError: Linear expects shape[1] of arguments: [[None, None], [None, 1024]]
I am doing something similar when I determine my initial state vector for the RNNs.
state = tf.zeros((batch_size, decoder_multi_rnn.state_size), tf.float32)
I also get (?, ?) when I print the shape of state, but it does not throw any exceptions when I use it.
You are mixing static shapes and dynamic shapes. The static shape is what you get from tensor.get_shape(), which is a best-effort attempt to infer the shape at graph-construction time, while the dynamic shape comes from sess.run(tf.shape(tensor)) and is always fully defined.
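Here is a minimal sketch of the distinction (assuming the graph-mode tf.placeholder/tf.Session API used throughout this thread; the 32-wide placeholder and the numpy feed are only illustrative):

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 32])  # batch dimension unknown at graph-construction time

# Static shape: inferred while building the graph, may contain unknowns.
print(x.get_shape())  # (?, 32)

# Dynamic shape: an op that yields the concrete shape once the graph is run.
shape_op = tf.shape(x)
with tf.Session() as sess:
    print(sess.run(shape_op, feed_dict={x: np.zeros((4, 32))}))  # [ 4 32]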
To be more precise, tf.shape(tensor) creates an op in the graph that produces the shape tensor when you issue a run call. If you write aop = tf.shape(tensor)[0], some magic in _SliceHelper adds extra ops that extract the first element of that shape tensor, again only on the run call.
This means that myval = tf.zeros((aop, 128)) has to run aop to obtain its dimensions, so the first dimension of myval is undefined until you issue the run call. For instance, your run call could look like sess.run(myval, feed_dict={aop: 2}), where feed_dict overrides aop with 2. Hence static shape inference reports ? for that dimension.
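For example (again a sketch with the graph-mode API; the 32-wide input is only illustrative), feeding a concrete input lets the run call resolve the dynamic batch dimension even though the static shape stays unknown:

import numpy as np
import tensorflow as tf

input_tensor = tf.placeholder(tf.float32, [None, 32])
batch_size = tf.shape(input_tensor)[0]  # a tensor whose value is only known at run time
myval = tf.zeros((batch_size, 128))

# (?, ?) on the TensorFlow versions discussed here; later versions infer (?, 128),
# see the note about the fix further down.
print(myval.get_shape())

with tf.Session() as sess:
    out = sess.run(myval, feed_dict={input_tensor: np.zeros((4, 32))})
    print(out.shape)  # (4, 128) -- fully defined once the graph actually runs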
(EDIT: I have rewritten this answer, as what I wrote before was not to the point.)
The quick fix to your issue is to use set_shape()
to update the static (inferred) shape of the Tensor:
input_tensor = tf.placeholder(tf.float32, [None, 32])
batch_size = tf.shape(input_tensor)[0]
res = tf.zeros((batch_size, 128))
print(res.get_shape())  # prints (?, ?) whereas one could expect (?, 128)
res.set_shape([None, 128])
print(res.get_shape())  # prints (?, 128)
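Applied to the pattern from the question, a sketch might look like this (the 896-wide rnn_input is a hypothetical stand-in, and it uses the newer tf.concat(values, axis) argument order rather than the older order in the question):

import tensorflow as tf

input_tensor = tf.placeholder(tf.float32, [None, 32])
batch_size = tf.shape(input_tensor)[0]

rnn_input = tf.placeholder(tf.float32, [None, 896])  # stand-in for the reduced decoder input
last_out = tf.zeros((batch_size, 128))
last_out.set_shape([None, 128])  # declare the statically known width

merged = tf.concat([rnn_input, last_out], 1)
print(merged.get_shape())  # (?, 1024) -- shape[1] is now statically known, so
                           # shape-checking layers no longer see [None, None]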
As for why TensorFlow loses the information about the second dimension being 128, I don't really know. Maybe @Yaroslav will be able to answer.
EDIT: The incorrect behavior was corrected following this issue.