
Add 2 tensors with different rank

I have 2 tensors: A with shape (None, 16, 7, 7, 1024) and B with shape (1, 16, 7, 7, 1024). I add them using keras.layers.add([A, B]). I expect the result to have shape (None, 16, 7, 7, 1024), but I get (1, 16, 7, 7, 1024); note that the batch size has become 1. How can I get a result whose batch dimension is None, as I want?

Code:

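import numpy as np
import tensorflow as tf
from keras import backend as K
from keras import layers
from keras.layers import Activation, Conv3D, Input, Lambda

# Constant hidden state with shape (16, 7, 7, 1024)
_h_state = np.zeros((16, 7, 7, 1024))
h_state = Input(tensor=tf.constant(_h_state, dtype=tf.float32), name='input_h_state')
enc = encoder.output  # encoder is defined elsewhere

enc_x = Conv3D(filters=256, kernel_size=(1, 1, 1), strides=(1, 1, 1), name='enc_conv')(enc)
# Add a leading axis of size 1 so the state can broadcast against the batch
h_state_expanded = Lambda(lambda x: K.expand_dims(x, 0))(h_state)
h_state_x = Conv3D(filters=256, kernel_size=(1, 1, 1), strides=(1, 1, 1), name='h_state_conv')(h_state_expanded)
x = layers.add([enc_x, h_state_x])
x = Activation('tanh')(x)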
.
.
.

Plot: [image: model plot]

donto asked Jan 20 '26 08:01


1 Answer

When you print x.shape, the output is (None, 16, 7, 7, 1024), but interestingly both plot_model and model.summary() show the "unbroadcast" first dimension, that is, 1 instead of None.

I believe you are right - the method keras.layers._Merge.compute_output_shape might not be handling broadcasting correctly for the first dimension in this particular case. That is something that should probably be fixed via a pull request.
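
To make the expected behavior concrete, here is a minimal sketch of how a static output shape could be inferred under broadcasting when some dimensions are unknown. The helper name broadcast_static_shape is hypothetical and this is not the Keras implementation; it only illustrates the rule that a size-1 axis should yield the other operand's dimension, including None.

```python
def broadcast_static_shape(shape_a, shape_b):
    """Hypothetical helper: combine two same-rank static shapes under broadcasting.

    None stands for an unknown dimension (such as the Keras batch size).
    """
    if len(shape_a) != len(shape_b):
        raise ValueError("this sketch only handles shapes of equal rank")
    result = []
    for dim_a, dim_b in zip(shape_a, shape_b):
        if dim_a == 1:
            # A size-1 axis is stretched to match the other operand.
            result.append(dim_b)
        elif dim_b == 1:
            result.append(dim_a)
        elif dim_a is None or dim_b is None:
            # One side unknown: keep the known size if there is one.
            result.append(dim_a if dim_b is None else dim_b)
        elif dim_a == dim_b:
            result.append(dim_a)
        else:
            raise ValueError(f"incompatible dimensions: {dim_a} and {dim_b}")
    return tuple(result)


a_shape = (None, 16, 7, 7, 1024)
b_shape = (1, 16, 7, 7, 1024)
print(broadcast_static_shape(a_shape, b_shape))  # (None, 16, 7, 7, 1024)
```

With this rule the order of the inputs does not matter: the size-1 batch axis of B never overrides the unknown batch axis of A.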

In the meantime, you can instead use:

x = Lambda(lambda x: x[0] + x[1])([enc_x, h_state_x])

which gives the expected output shape.
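
The reason this works is that the + inside the Lambda follows ordinary elementwise broadcasting. A small NumPy sketch shows the same rule on concrete arrays; the shapes below are made up and much smaller than in the question, and in a Keras graph the batch dimension would be None rather than a fixed number.

```python
import numpy as np

# Made-up small shapes: a batch of 3 items and a single shared state.
enc_x = np.ones((3, 2, 2))
h_state_x = np.full((1, 2, 2), 0.5)

# Same expression as the Lambda body: x[0] + x[1]
tensors = [enc_x, h_state_x]
summed = tensors[0] + tensors[1]

# The size-1 leading axis is stretched across the batch.
print(summed.shape)  # (3, 2, 2)
```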

rvinas answered Jan 22 '26 21:01