How can I matrix-multiply two PyTorch quantized Tensors?

I am new to tensor quantization, and tried doing something as simple as

import torch
x = torch.rand(10, 3)
y = torch.rand(10, 3)

x @ y.T

with PyTorch quantized tensors running on CPU. I thus tried

scale, zero_point = 1e-4, 2
dtype = torch.qint32
qx = torch.quantize_per_tensor(x, scale, zero_point, dtype)
qy = torch.quantize_per_tensor(y, scale, zero_point, dtype)

qx @ qy.T  # I tried...

...and got this error:

RuntimeError: Could not run 'aten::mm' with arguments from the 'QuantizedCPUTensorId' backend. 'aten::mm' is only available for these backends: [CUDATensorId, SparseCPUTensorId, VariableTensorId, CPUTensorId, SparseCUDATensorId].

Is matrix multiplication just not supported, or am I doing something wrong?

Asked by Davide Fiocco, Feb 20 '20 17:02



1 Answer

Matrix multiplication is not straightforward to implement for quantized matrices. The regular matrix-multiplication operator (@) therefore does not support quantized tensors, as your error message indicates.

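You should look at quantized operations, e.g., torch.nn.quantized.functional.linear. According to its documentation, it computes input @ weight.T (so the weight is passed untransposed) and expects a torch.quint8 input and a torch.qint8 weight, so qx and qy would have to be quantized with those dtypes (and scales that fit an 8-bit range) rather than with torch.qint32:

torch.nn.quantized.functional.linear(qx, qy)  # qx: quint8, qy: qint8; quantized approximation of x @ y.T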
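
To illustrate why a plain @ is not enough, here is a minimal pure-Python sketch of the arithmetic behind a quantized matrix product, assuming the affine scheme q = round(v / scale) + zero_point. The helpers quantize and quantized_matmul_t are hypothetical names made up for this illustration, not PyTorch functions, and the sketch skips the clamping to the integer range that a real quantized dtype performs.

```python
def quantize(matrix, scale, zero_point):
    """Affine quantization: q = round(v / scale) + zero_point (no clamping)."""
    return [[round(v / scale) + zero_point for v in row] for row in matrix]


def quantized_matmul_t(qa, qb, scale_a, zp_a, scale_b, zp_b):
    """Approximate a @ b.T using only the integer values and the quantization parameters."""
    out = []
    for row_a in qa:
        out_row = []
        for row_b in qb:
            # Zero points must be subtracted before multiplying; the
            # accumulation itself is done entirely on integers.
            acc = sum((p - zp_a) * (q - zp_b) for p, q in zip(row_a, row_b))
            # The product of two quantized values carries both scales.
            out_row.append(scale_a * scale_b * acc)
        out.append(out_row)
    return out


x = [[0.5, 0.25], [0.125, 1.0]]
y = [[1.0, 0.5], [0.25, 0.75]]
scale, zero_point = 1e-4, 2  # same parameters as in the question

qx = quantize(x, scale, zero_point)
qy = quantize(y, scale, zero_point)

approx = quantized_matmul_t(qx, qy, scale, zero_point, scale, zero_point)
exact = [[sum(a * b for a, b in zip(rx, ry)) for ry in y] for rx in x]
```

The integer values alone are not enough: the kernel has to account for both zero points and both scales, and a real quantized kernel additionally requantizes the result with an output scale and zero_point, which is why the quantized linear function accepts those as extra arguments. If speed is not a concern, dequantizing both tensors with Tensor.dequantize() and multiplying the resulting float tensors is a simple fallback.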

Answered by Shai, Oct 21 '22 16:10