Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ValueError when performing matmul with Tensorflow

Tags:

tensorflow

I'm a total beginner to TensorFlow, and I'm trying to multiply two matrices together, but I keep getting an exception that says:

ValueError: Shapes TensorShape([Dimension(2)]) and TensorShape([Dimension(None), Dimension(None)]) must have the same rank

Here's minimal example code:

data = np.array([0.1, 0.2])
x = tf.placeholder("float", shape=[2])
T1 = tf.Variable(tf.ones([2,2]))
l1 = tf.matmul(T1, x)
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    sess.run(feed_dict={x: data}

Confusingly, the following very similar code works fine:

data = np.array([0.1, 0.2])
x = tf.placeholder("float", shape=[2])
T1 = tf.Variable(tf.ones([2,2]))
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    sess.run(T1*x, feed_dict={x: data}

Can anyone point to what the issue is? I must be missing something obvious here..

like image 515
homesalad Avatar asked Jan 20 '16 18:01

homesalad


People also ask

What is Tensorflow Matmul?

Multiplies matrix a by matrix b , producing a * b . The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication arguments, and any further outer dimensions match.

How do you use NP Matmul?

The numpy matmul() function takes arr1 and arr2 as arguments and returns the matrix product of the input arrays. To multiply two arrays in Python, use the np. matmul() method. In the case of 2D matrices, a regular matrix product is returned.

How do you multiply tensors?

Multiply two or more tensors using torch. mul() and assign the value to a new variable. You can also multiply a scalar quantity and a tensor. Multiplying the tensors using this method does not make any change in the original tensors.


1 Answers

The tf.matmul() op requires that both of its inputs are matrices (i.e. 2-D tensors)*, and doesn't perform any automatic conversion. Your T1 variable is a matrix, but your x placeholder is a length-2 vector (i.e. a 1-D tensor), which is the source of the error.

By contrast, the * operator (an alias for tf.multiply()) is a broadcasting element-wise multiplication. It will convert the vector argument to a matrix by following NumPy broadcasting rules.

To make your matrix multiplication work, you can either require that x is a matrix:

data = np.array([[0.1], [0.2]])
x = tf.placeholder(tf.float32, shape=[2, 1])
T1 = tf.Variable(tf.ones([2, 2]))
l1 = tf.matmul(T1, x)
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    sess.run(l1, feed_dict={x: data})

...or you could use the tf.expand_dims() op to convert the vector to a matrix:

data = np.array([0.1, 0.2])
x = tf.placeholder(tf.float32, shape=[2])
T1 = tf.Variable(tf.ones([2, 2]))
l1 = tf.matmul(T1, tf.expand_dims(x, 1))
init = tf.initialize_all_variables()

with tf.Session() as sess:
    # ...

* This was true when I posted the answer at first, but now tf.matmul() also supports batched matrix multiplications. This requires both arguments to have at least 2 dimensions. See the documentation for more details.

like image 187
mrry Avatar answered Oct 07 '22 12:10

mrry