 

How to concatenate two tensors having different shape with TensorFlow?

Hello, I'm new to TensorFlow and I'd like to concatenate a 2D tensor to a 3D one. I don't know how to do it using TensorFlow functions.

tensor_3d = [[[1,2], [3,4]], [[5,6], [7,8]]]  # shape (2, 2, 2)
tensor_2d = [[10,11], [12,13]]                # shape (2, 2)

out: [[[1,2,10,11], [3,4,10,11]], [[5,6,12,13], [7,8,12,13]]]  # shape (2, 2, 4)

I could make it work using loops and new numpy arrays, but that way I wouldn't be using TensorFlow transformations. Any suggestions on how to make this possible? I don't see how transformations like tf.expand_dims or tf.reshape would help here...

Thanks for sharing your knowledge.

asked Dec 17 '22 by Matt

2 Answers

This should do the trick:

import tensorflow as tf

a = tf.constant([[[1,2], [3,4]], [[5,6], [7,8]]]) 
b = tf.constant([[10,11], [12,13]])

c = tf.expand_dims(b, axis=1)     # Add a middle dimension: shape (2, 1, 2)
d = tf.tile(c, multiples=[1,2,1]) # Duplicate along that dimension: shape (2, 2, 2)
e = tf.concat([a,d], axis=-1)     # Concatenate on the innermost dimension: shape (2, 2, 4)

with tf.Session() as sess:
    print(e.eval())

Gives:

[[[ 1  2 10 11]
  [ 3  4 10 11]]

 [[ 5  6 12 13]
  [ 7  8 12 13]]]
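
Note that tf.Session is TensorFlow 1.x API. Under TensorFlow 2.x, where eager execution is the default, the same three ops work without a session; a minimal sketch:

import tensorflow as tf

a = tf.constant([[[1,2], [3,4]], [[5,6], [7,8]]])  # shape (2, 2, 2)
b = tf.constant([[10,11], [12,13]])                # shape (2, 2)

c = tf.expand_dims(b, axis=1)        # shape (2, 1, 2)
d = tf.tile(c, multiples=[1, 2, 1])  # shape (2, 2, 2)
e = tf.concat([a, d], axis=-1)       # shape (2, 2, 4)
print(e.numpy())                     # no session needed under eager execution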
answered Jan 23 '23 by sdcbr


There is actually a different trick that is used from time to time in codebases such as OpenAI's baselines.

Suppose you have two tensors for your Gaussian policy, mu and std. The standard deviation has the same shape as mu for batch size 1, but because you use the same parameterized standard deviation for all actions, the two shapes differ when the batch size is larger than 1:

mu : Size<batch_size, feat_n>
std: Size<1, feat_n>

In this case a simple thing to do (and what the OpenAI baselines do) is:

params = tf.concat([mu, mu * 0 + std], axis=-1)

The zero multiplication broadcasts std up to mu's shape: mu * 0 has shape [batch_size, feat_n], and adding std (shape [1, feat_n]) broadcasts it across the batch dimension, so the two arguments to tf.concat match.
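
A minimal, self-contained sketch of the trick (the concrete values are illustrative, not from the baselines code; runs eagerly under TensorFlow 2.x):

import tensorflow as tf

mu = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # shape (3, 2): batch_size=3, feat_n=2
std = tf.constant([[0.5, 0.7]])                         # shape (1, 2): shared across the batch

# mu * 0 has shape (3, 2); adding std broadcasts it to (3, 2) as well
params = tf.concat([mu, mu * 0 + std], axis=-1)         # shape (3, 4)
print(params.numpy())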

Enjoy, and good luck training!

ps: Neither numpy's nor tensorflow's concat operator automatically applies broadcasting, because according to the maintainers, a shape mismatch between two tensors is usually the result of a programming error. This is not a big deal in numpy, where computations are evaluated eagerly, but in tensorflow it means you have to explicitly broadcast the lower-rank tensor (or the one with shape [1, *_]) by hand, e.g. with tf.tile and the tf.shape operator, as in the sketch below.
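
A sketch of that explicit, by-hand broadcast (tf.tile and tf.shape are real TensorFlow ops; the tensors and shapes are just illustrative):

import tensorflow as tf

mu = tf.constant([[1.0, 2.0], [3.0, 4.0]])    # shape (2, 2): batch_size=2, feat_n=2
std = tf.constant([[0.5, 0.7]])               # shape (1, 2)

# Tile std up to mu's (possibly dynamic) batch size, then concatenate
batch_size = tf.shape(mu)[0]
std_tiled = tf.tile(std, [batch_size, 1])     # shape (2, 2)
params = tf.concat([mu, std_tiled], axis=-1)  # shape (2, 4)
print(params.numpy())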

answered Jan 23 '23 by episodeyang