Flattening two last dimensions of a tensor in TensorFlow

Question

I'm trying to reshape a tensor from [A, B, C, D] into [A, B, C * D] and feed it into a dynamic_rnn. Assume that I don't know the B, C, and D in advance (they're a result of a convolutional network).

I think in Theano such reshaping would look like this:

x = x.flatten(ndim=3)

It seems that in TensorFlow there's no easy way to do this and so far here's what I came up with:

x_shape = tf.shape(x)
x = tf.reshape(x, [batch_size, x_shape[1], tf.reduce_prod(x_shape[2:])]

Even when the shape of x is known during graph building (i.e. print(x.get_shape()) prints out absolute values, like [10, 20, 30, 40] after the reshaping get_shape() becomes [10, None, None]. Again, still assume the initial shape isn't known so I can't operate with absolute values.

And when I'm passing x to a dynamic_rnn it fails:

ValueError: Input size (depth of inputs) must be accessible via shape inference, but saw value None.

Why is reshape unable to handle this case? What is the right way of replicating Theano's flatten(ndim=n) in TensorFlow with tensors of rank 4 and more?

P-Gn · Accepted Answer

It is not a flaw in reshape, but a limitation of tf.dynamic_rnn.

Your code to flatten the last two dimensions is correct. And, reshape behaves correctly too: if the last two dimensions are unknown when you define the flattening operation, then so is their product, and None is the only appropriate value that can be returned at this time.

The culprit is tf.dynamic_rnn, which expects a fully-defined feature shape during construction, i.e. all dimensions apart from the first (batch size) and the second (time steps) must be known. It is a bit unfortunate perhaps, but the current implementation does not seem to allow RNNs with a variable number of features, à la FCN.

Nipun Wijerathne · Answer

I tried a simple code according to your requirements. Since you are trying to reshape a CNN output, the shape of X is same as the output of CNN in Tensorflow.

HEIGHT = 100
WIDTH  = 200
N_CHANELS =3

N_HIDDEN =64

X = tf.placeholder(tf.float32, shape=[None,HEIGHT,WIDTH,N_CHANELS],name='input') # output of CNN

shape = X.get_shape().as_list() # get the shape of each dimention shape[0] =BATCH_SIZE , shape[1] = HEIGHT , shape[2] = HEIGHT = WIDTH , shape[3] = N_CHANELS

input = tf.reshape(X, [-1, shape[1] , shape[2] * shape[3]])
print(input.shape) # prints (?, 100, 600)

#Input for tf.nn.dynamic_rnn should be in the shape of [BATCH_SIZE, N_TIMESTEPS, INPUT_SIZE]     

#Therefore, according to the reshape N_TIMESTEPS = 100 and INPUT_SIZE= 600

#create the RNN here
lstm_layers = tf.contrib.rnn.BasicLSTMCell(N_HIDDEN, forget_bias=1.0)
outputs, _ = tf.nn.dynamic_rnn(lstm_layers, input, dtype=tf.float32)

Hope this helps.

Flattening two last dimensions of a tensor in TensorFlow

Tags:

python

tensorflow

theano

rnn

naktinis

2 Answers

P-Gn

Nipun Wijerathne

Recent Activity

Donate For Us

Flattening two last dimensions of a tensor in TensorFlow

Tags:

python

tensorflow

theano

rnn

naktinis

2 Answers

P-Gn

Nipun Wijerathne

Related questions

Recent Activity

Donate For Us