I'm implementing an RBM in TensorFlow, and I've hit an obstacle in implementing the parameter update using mini-batches.
There are two tensors: the first has shape [100, 3, 1] and the second has shape [100, 1, 4], where 100 is the batch size.
I want to multiply these tensors to produce a [100, 3, 4] tensor, but when I write code like
tf.tensordot(1st_tensor, 2nd_tensor, [[2], [1]])
the resulting tensor's shape is [100, 3, 100, 4].
How do I solve this problem?
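A minimal reproduction (variable names here are illustrative): tf.tensordot contracts the requested axes, but it does not pair up the two batch dimensions; it takes an outer product over them instead.

import numpy as np
import tensorflow as tf

t1 = tf.constant(np.random.rand(100, 3, 1))  # [batch, 3, 1]
t2 = tf.constant(np.random.rand(100, 1, 4))  # [batch, 1, 4]

# Contract axis 2 of t1 with axis 1 of t2; both batch axes survive
# independently, giving the unwanted outer-product shape:
out = tf.tensordot(t1, t2, axes=[[2], [1]])
print(out.shape)  # (100, 3, 100, 4), not the desired (100, 3, 4)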
I'm not sure if you're still facing this issue (as it's been a month), but I resolved the same problem by using tf.tensordot together with tf.map_fn, which accepts nested input elements and maps a function across the first (usually batch) dimension. The following function performs a batch-parallel matrix multiplication across the final two dimensions of your tensors, which can be of arbitrary rank (as long as the last axis of the first tensor matches the second-to-last axis of the second, for the purposes of matrix multiplication):
import tensorflow as tf

def matmul_final_two_dims(tensor1, tensor2):
    # map_fn has trouble inferring the output dtype from nested elems,
    # so set this to match your tensors' dtype:
    _your_dtype_here = tf.float64
    # For each batch element, contract the last axis of tensor1 with the
    # second-to-last axis of tensor2 (a matrix multiplication over the
    # final two dimensions):
    return tf.map_fn(lambda xy: tf.tensordot(xy[0], xy[1], axes=[[-1], [-2]]),
                     elems=(tensor1, tensor2), dtype=_your_dtype_here)
Example usage (with numpy imported as np and an active session sess):
>>> batchsize = 3
>>> tensor1 = np.random.rand(batchsize, 3, 4, 5, 2)  # final dims [5, 2]
>>> tensor2 = np.random.rand(batchsize, 2, 3, 2, 4)  # final dims [2, 4]
>>> sess.run(tf.shape(matmul_final_two_dims(tensor1, tensor2)))
array([3, 3, 4, 5, 2, 3, 4], dtype=int32)
>>> matmul_final_two_dims(tensor1, tensor2)
<tf.Tensor 'map_1/TensorArrayStack/TensorArrayGatherV3:0' shape=(3, 3, 4, 5, 2, 3, 4) dtype=float64>
Note in particular that the first dimension of the output is the correct batch size, and that the trailing 2 in tensor1's shape (along with the matching 2 in tensor2) has been tensor-contracted out. You will have to do some sort of tf.transpose operation to get the size-5 axis into the right place, though, since the indices of the output tensor are ordered as they appear in the input tensors.
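For example, a sketch of such a transpose on the output above (the exact permutation depends on the layout you want):

out = matmul_final_two_dims(tensor1, tensor2)   # shape (3, 3, 4, 5, 2, 3, 4)
out = tf.transpose(out, perm=[0, 1, 2, 4, 5, 6, 3])  # size-5 axis moved to the end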
I'm using TF v1.1. tf.map_fn can be parallelized, but I'm not sure the above is the most efficient implementation. For reference:
tf.tensordot API
tf.map_fn API
EDIT: the above is what worked for me, but I think you can also use an einsum (docs here) to accomplish what you want:
>>> tensor1 = tf.constant(np.random.rand(3, 4, 5))
>>> tensor2 = tf.constant(np.random.rand(3, 5, 7))
>>> tf.einsum('bij,bjk->bik', tensor1, tensor2)
<tf.Tensor 'transpose_2:0' shape=(3, 4, 7) dtype=float64>
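Applied to the shapes in your question, the same subscripts give the desired [100, 3, 4] result directly (a sketch; variable names are illustrative):

>>> t1 = tf.constant(np.random.rand(100, 3, 1))
>>> t2 = tf.constant(np.random.rand(100, 1, 4))
>>> out = tf.einsum('bij,bjk->bik', t1, t2)  # shape (100, 3, 4)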
You can use tf.keras.backend.batch_dot instead; it expects the first dimension of both tensors to be the batch size, and should do what you want.
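A minimal sketch on the shapes from the question (assuming a TF version that ships tf.keras; the axes argument names the axes to contract, here the two size-1 axes):

import numpy as np
import tensorflow as tf

t1 = tf.constant(np.random.rand(100, 3, 1))
t2 = tf.constant(np.random.rand(100, 1, 4))
# Contract axis 2 of t1 with axis 1 of t2, batching over axis 0:
out = tf.keras.backend.batch_dot(t1, t2, axes=[2, 1])
print(out.shape)  # (100, 3, 4)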