Is there any difference between matmul and usual multiplication of tensors?

I am confused about the difference between multiplying two tensors with * and with matmul. Below is my code

import torch
torch.manual_seed(7)
features = torch.randn((2, 5))
weights = torch.randn_like(features)

Here, I want to multiply weights and features. One way to do it is as follows:

print(torch.sum(features * weights))

Output:

tensor(-2.6123)

Another way is to use matmul:

print(torch.mm(features,weights.view((5,2))))

but here the output is

tensor([[ 2.8089,  4.6439],
        [-2.3988, -1.9238]])

What I don't understand is why matmul and the usual multiplication give different outputs, since I expected them to be the same. Am I doing anything wrong here?

Edit: When I use features of shape (1, 5), both * and matmul give the same output, but the results differ when the shape is (2, 5).

asked Nov 08 '18 by InAFlash


People also ask

What is Matmul operation?

The MatMul operation takes two tensors and performs the usual matrix-matrix, matrix-vector, or vector-matrix multiplication, depending on the argument shapes. Input tensors can have any rank >= 1.
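
For instance, a minimal PyTorch sketch (shapes and variable names are illustrative) of how torch.matmul dispatches on argument rank:

import torch

A = torch.randn(3, 4)   # rank-2 tensor (matrix)
v = torch.randn(4)      # rank-1 tensor (vector)

print(torch.matmul(A, A.T).shape)  # matrix-matrix: torch.Size([3, 3])
print(torch.matmul(A, v).shape)    # matrix-vector: torch.Size([3])
print(torch.matmul(v, A.T).shape)  # vector-matrix: torch.Size([3])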

Which operator is used to perform matrix multiplication on tensors?

Tensor Hadamard Product: As with matrices, the element-wise operation is referred to as the Hadamard product to differentiate it from tensor (matrix) multiplication. The "o" operator is often used to indicate the Hadamard product between tensors. In NumPy, we can multiply tensors element-wise simply by multiplying the arrays.
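
As an illustration (the array values are arbitrary), element-wise multiplication of NumPy arrays with *:

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[10, 20, 30], [40, 50, 60]])

# The * operator computes the Hadamard (element-wise) product.
print(a * b)  # [[ 10  40  90]
              #  [160 250 360]]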

What is Tensorflow Matmul?

This is simply matrix multiplication; it is achieved using the "linalg.matmul" function available in TensorFlow. It returns the matrix product, e.g. multiplying matrix "a" by matrix "b" produces the matrix product of a and b.
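
A minimal sketch (values are arbitrary) using tf.linalg.matmul, for which tf.matmul is an alias:

import tensorflow as tf

a = tf.constant([[1., 2.], [3., 4.]])
b = tf.constant([[5., 6.], [7., 8.]])

# Matrix product of a and b, shape (2, 2)
print(tf.linalg.matmul(a, b))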

What is NP Matmul?

The numpy.matmul() function returns the matrix product of two arrays. It returns an ordinary matrix product for 2-D arrays; if either argument has more than two dimensions, it is treated as a stack of matrices residing in the last two indices and is broadcast accordingly.
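
A short sketch of both behaviours (the shapes are illustrative):

import numpy as np

a = np.random.rand(2, 3)
b = np.random.rand(3, 4)
print(np.matmul(a, b).shape)  # ordinary 2-D matrix product: (2, 4)

# With more than two dimensions, the last two axes are matrix-multiplied
# and the leading axes are broadcast as a stack of matrices.
stack_a = np.random.rand(10, 2, 3)
stack_b = np.random.rand(10, 3, 4)
print(np.matmul(stack_a, stack_b).shape)  # (10, 2, 4)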


1 Answer

When you use *, the multiplication is element-wise; when you use torch.mm, it is matrix multiplication.

Example:

a = torch.rand(2,5)
b = torch.rand(2,5)
result = a*b 

result will have the same shape as a and b, i.e. (2, 5), whereas the operation

result = torch.mm(a,b)

will give a size mismatch error, because this is proper matrix multiplication (as studied in linear algebra) and a.shape[1] != b.shape[0]. When you apply the view operation before torch.mm, you are reshaping the tensor so that the dimensions match (see the sketch below).
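
A minimal sketch of those shape rules (the tensor values are random, so the numbers themselves are not meaningful):

import torch

a = torch.rand(2, 5)
b = torch.rand(2, 5)

# torch.mm requires a.shape[1] == b.shape[0], so torch.mm(a, b) would raise
# a RuntimeError here (5 != 2).
# Reshaping b to (5, 2) makes the shapes compatible and yields a (2, 2) result,
# but note that view() only rearranges the existing elements; it is not a transpose.
print(torch.mm(a, b.view(5, 2)).shape)   # torch.Size([2, 2])
print(torch.equal(b.view(5, 2), b.t()))  # False in general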

In the special case where one of the dimensions is 1, the matrix product reduces to a dot product, and hence sum(a * b) is the same as mm(a, b.view(5, 1)).
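
A quick sketch of that special case, using a (1, 5) shape as in the question's edit:

import torch
torch.manual_seed(7)
features = torch.randn((1, 5))
weights = torch.randn_like(features)

# Both expressions compute the same dot product; the second one
# just returns it wrapped in a (1, 1) tensor.
print(torch.sum(features * weights))
print(torch.mm(features, weights.view(5, 1)))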

answered Oct 17 '22 by Umang Gupta