
Obstacles in TensorFlow's tensordot with batch multiplication

I'm implementing an RBM in TensorFlow, and I've hit an obstacle implementing the parameter update with mini-batches.

There are two tensors: the first has shape [100,3,1] and the second has shape [100,1,4], where 100 is the batch size.

I want to multiply these tensors so that the result is a [100,3,4] tensor.

But when I implement code like


the resulting tensor's shape is [100,3,100,4].

How do I solve this problem?

bj1123 asked Mar 10 '23 01:03


2 Answers

I'm not sure if you're still facing this issue (it's been a month), but I resolved the same problem with tf.tensordot and tf.map_fn. tf.map_fn accepts nested input elements and maps a function over the first (usually batch) dimension, optionally in parallel. The following function applies tf.tensordot to each pair of batch elements, contracting the last axis of the first tensor with the second-to-last axis of the second. It works for tensors of arbitrary rank, as long as those two axes have the same size:

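import tensorflow as tf

def matmul_final_two_dims(tensor1, tensor2, dtype=tf.float64):
  # Set dtype to match your inputs: map_fn needs the output dtype spelled
  # out here, because fn returns a single tensor while elems is a tuple.
  # Each call contracts the last axis of xy[0] with the second-to-last
  # axis of xy[1]; map_fn stacks the results along the batch axis.
  return tf.map_fn(lambda xy: tf.tensordot(xy[0], xy[1], axes=[[-1], [-2]]),
                   elems=(tensor1, tensor2), dtype=dtype)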

Example usage:

>> batchsize = 3
>> tensor1 = np.random.rand(batchsize,3,4,5,2) # final dims [5,2]
>> tensor2 = np.random.rand(batchsize,2,3,2,4) # final dims [2,4]
>> sess.run(tf.shape(matmul_final_two_dims(tensor1, tensor2)))
array([3, 3, 4, 5, 2, 3, 4], dtype=int32)
>> matmul_final_two_dims(tensor1,tensor2)
<tf.Tensor 'map_1/TensorArrayStack/TensorArrayGatherV3:0' shape=(3, 3, 4, 5, 2, 3, 4) dtype=float64>

Note in particular that the first dimension of the output is the correct batch size and that the final 2 in tensor1's shape has been contracted away. You will still need some sort of tf.transpose operation to move the size-5 axis next to the final size-4 axis, though, because the output axes are ordered as they appear in the input tensors: the remaining axes of tensor1 first, then the remaining axes of tensor2.
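To make the axis ordering concrete, here is a minimal sketch in NumPy rather than TensorFlow (np.tensordot follows the same axes convention as tf.tensordot). The explicit loop and np.stack stand in for tf.map_fn, and the permutation is just one possible choice, picked on the assumption that you want the [5, 4] products in the last two axes:

```python
import numpy as np

batchsize = 3
tensor1 = np.random.rand(batchsize, 3, 4, 5, 2)  # final dims [5, 2]
tensor2 = np.random.rand(batchsize, 2, 3, 2, 4)  # final dims [2, 4]

# Per-example contraction of tensor1's last axis with tensor2's
# second-to-last axis, stacked along the batch axis. This loop is a
# NumPy stand-in for the map_fn call, not the TensorFlow code itself.
stacked = np.stack([np.tensordot(a, b, axes=[[-1], [-2]])
                    for a, b in zip(tensor1, tensor2)])
print(stacked.shape)  # (3, 3, 4, 5, 2, 3, 4)

# Move the size-5 axis (index 3) so it sits directly before the final
# size-4 axis; the last two axes then hold the [5, 4] matrix products.
reordered = np.transpose(stacked, (0, 1, 2, 4, 5, 3, 6))
print(reordered.shape)  # (3, 3, 4, 2, 3, 5, 4)
```

After the transpose, reordered[b, i, j, k, l] is the ordinary matrix product of tensor1[b, i, j] and tensor2[b, k, l].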

I'm using TFv1.1. tf.map_fn can be parallelized but I'm not sure if the above is the most efficient implementation. For reference:

tf.tensordot API

tf.map_fn API

EDIT: the above was what worked for me, but I think you can also use an einsum (docs here) to accomplish what you want:

>> tensor1 = tf.constant(np.random.rand(3,4,5))
>> tensor2 = tf.constant(np.random.rand(3,5,7))
>> tf.einsum('bij,bjk->bik', tensor1, tensor2)
<tf.Tensor 'transpose_2:0' shape=(3, 4, 7) dtype=float64>
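For the shapes in the question, the difference between the two approaches can be sketched as follows. This uses NumPy as a stand-in (np.tensordot and np.einsum use the same axes and subscript conventions as their TensorFlow counterparts); the names v and h are hypothetical placeholders for the two tensors, and the axes passed to tensordot are a guess at what the question's omitted code did:

```python
import numpy as np

v = np.random.rand(100, 3, 1)  # hypothetical name for the first tensor
h = np.random.rand(100, 1, 4)  # hypothetical name for the second tensor

# tensordot only contracts the listed axes. Both batch axes are left
# over as free axes, so every example is paired with every other one.
outer = np.tensordot(v, h, axes=[[2], [1]])
print(outer.shape)  # (100, 3, 100, 4)

# einsum with a shared 'b' subscript keeps a single batch axis.
batched = np.einsum('bij,bjk->bik', v, h)
print(batched.shape)  # (100, 3, 4)

# The batched result matches a per-example matrix product, and it is
# the "diagonal" of the tensordot result over the two batch axes.
assert np.allclose(batched, np.matmul(v, h))
assert np.allclose(outer[7, :, 7, :], batched[7])
```

So the [100,3,100,4] result does contain the wanted values, but only on the slices where both batch indices agree; the einsum form avoids computing the rest.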
ptsw answered Mar 23 '23 02:03


You can use tf.keras.backend.batch_dot instead; it expects the first dimension to be batch_size, and should do what you want it to do.
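As a related sketch, and not the batch_dot call itself: tf.matmul also treats all leading dimensions as batch dimensions, so for rank-3 inputs like the ones in the question it should give the same [100,3,4] result. The snippet below swaps in tf.matmul and assumes TensorFlow 2.x with eager execution; v and h are hypothetical names for the two tensors.

```python
import tensorflow as tf

# Hypothetical stand-ins for the two tensors from the question.
v = tf.random.uniform([100, 3, 1])
h = tf.random.uniform([100, 1, 4])

# tf.matmul multiplies the last two axes and treats the leading axis
# as a batch axis, so each of the 100 examples is multiplied separately.
out = tf.matmul(v, h)
print(out.shape)  # (100, 3, 4)

# The einsum spelling of the same batched product, for comparison.
alt = tf.einsum('bij,bjk->bik', v, h)
print(alt.shape)  # (100, 3, 4)
```

Whether batch_dot, tf.matmul or tf.einsum is preferable probably depends on your TensorFlow version and on how general the axes need to be.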

The AI Architect answered Mar 23 '23 01:03
