Keras dot/Dot layer behavior on 3D tensors

The Keras documentation for the dot/Dot layer states that:

"Layer that computes a dot product between samples in two tensors.

E.g. if applied to a list of two tensors a and b of shape (batch_size, n), the output will be a tensor of shape (batch_size, 1) where each entry i will be the dot product between a[i] and b[i].

Arguments

axes: Integer or tuple of integers, axis or axes along which to take the dot product."
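
For two 2D tensors the behavior matches the documentation; a minimal sketch of that case (shapes are only illustrative):

from keras.layers import Input, dot

# Two 2D inputs of shape (batch_size, n)
a = Input(batch_shape=(99, 300))
b = Input(batch_shape=(99, 300))

# Row-wise dot product: each sample of a is dotted with the matching sample of b
row_dot = dot([a, b], axes=-1)
print(row_dot.get_shape())  # (99, 1)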

I am not getting this. Here is a quick, reproducible example to demonstrate:

from keras.layers import Input, dot
input_a = Input(batch_shape=(99, 45000, 300))
input_b = Input(batch_shape=(99, 45000, 300))
element_wise_dot_product = dot([input_a, input_b], axes=-1)
print(input_a.get_shape(), input_b.get_shape(), element_wise_dot_product.get_shape())

Output: (99, 45000, 300) (99, 45000, 300) (99, 45000, 45000)

Why is the element-wise dot product shape not (99, 45000, 1)? What am I doing wrong, and how can I fix it?

Asked Aug 02 '18 by Kate A Baumli

1 Answer

The dot layer is performing a matrix multiplication along the last axis, because these are 3D tensors rather than 2D: for each sample, the 300-dimensional last axis of input_a is contracted against the last axis of input_b, leaving a (45000, 45000) matrix per sample. That is the shape you are seeing. What you actually want is a row-wise dot product: take the element-wise product of the two inputs and then sum along the last axis. For example,

import keras.backend as K
import tensorflow as tf

# Element-wise product of the matching (99, 45000, 300) tensors,
# then sum over the feature axis -> shape (99, 45000, 1)
K.sum(tf.multiply(input_a, input_b), axis=-1, keepdims=True)
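
To see why the original call returns (99, 45000, 45000): with axes=-1 on 3D inputs, the Dot layer contracts the last axis of each input per sample, which is the same contraction as a batched matrix multiplication with the second operand transposed. A quick sketch of that equivalence, assuming a TensorFlow backend and the input_a/input_b placeholders defined in the question:

import tensorflow as tf

# Batched-matmul view of dot([input_a, input_b], axes=-1) on 3D tensors:
# (99, 45000, 300) @ (99, 300, 45000) -> (99, 45000, 45000)
pairwise = tf.matmul(input_a, input_b, transpose_b=True)
print(pairwise.get_shape())  # (99, 45000, 45000)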

If you need a Keras-only solution, you can use keras.layers.multiply instead of tf.multiply and wrap the backend sum in a Lambda layer, as in the sketch below.
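
A sketch of that Keras-only variant, again assuming the input_a/input_b placeholders from the question (the variable names are just illustrative):

import keras.backend as K
from keras.layers import Lambda, multiply

# Element-wise product as a Keras layer: (99, 45000, 300)
prod = multiply([input_a, input_b])

# Sum over the feature axis inside a Lambda so the result stays a Keras tensor:
# output shape (99, 45000, 1)
element_wise_dot_product = Lambda(lambda t: K.sum(t, axis=-1, keepdims=True))(prod)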

Answered Oct 15 '22 by modesitt