I have some data represented by input_x. It is a tensor of unknown size (should be inputted by batch) and each item there is of size n. input_x undergoes tf.nn.embedding_lookup, so that embed now has dimensions [?, n, m] where m is the embedding size and ? refers to the unknown batch size.
This is described here:
input_x = tf.placeholder(tf.int32, [None, n], name="input_x") embed = tf.nn.embedding_lookup(W, input_x) I'm now trying to multiply each sample in my input data (which is now expanded by embedding dimension) by a matrix variable, U, and I can't seem to get how to do that.
I first tried using tf.matmul but it gives an error due to mismatch in shapes. I then tried the following, by expanding the dimension of U and applying batch_matmul (I also tried the function from tf.nn.math_ops., the result was the same):
U = tf.Variable( ... ) U1 = tf.expand_dims(U,0) h=tf.batch_matmul(embed, U1) This passes the initial compilation, but then when actual data is applied, I get the following error:
In[0].dim(0) and In[1].dim(0) must be the same: [64,58,128] vs [1,128,128]
I also know why this is happening - I replicated the dimension of U and it is now 1, but the minibatch size, 64, doesn't fit.
How can I do that matrix multiplication on my tensor-matrix input correctly (for unknown batch size)?
Previous answers are obsolete. Currently tf.matmul() support tensors with rank > 2:
The inputs must be matrices (or tensors of rank > 2, representing batches of matrices), with matching inner dimensions, possibly after transposition.
Also tf.batch_matmul() was removed and tf.matmul() is the right way to do batch multiplication. The main idea can be understood from the following code:
import tensorflow as tf batch_size, n, m, k = 10, 3, 5, 2 A = tf.Variable(tf.random_normal(shape=(batch_size, n, m))) B = tf.Variable(tf.random_normal(shape=(batch_size, m, k))) tf.matmul(A, B) Now you will receive a tensor of the shape (batch_size, n, k). Here is what is going on here. Assume you have batch_size of matrices nxm and batch_size of matrices mxk. Now for each pair of them you calculate nxm X mxk which gives you an nxk matrix. You will have batch_size of them.
Notice that something like this is also valid:
A = tf.Variable(tf.random_normal(shape=(a, b, n, m))) B = tf.Variable(tf.random_normal(shape=(a, b, m, k))) tf.matmul(A, B) and will give you a shape (a, b, n, k)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With