
Doing multiple matrix-matrix multiplications in one operation

Tags: c++, c, cuda, blas, cublas

I'm implementing an algorithm that, in essence, is a series of matrix-matrix multiplications like this:

Res = M_1 . M_2 . M_3 . ... . M_n

My matrices are really small, 100x100 floats, but the sequence is really long, on the order of billions.

I tried using CUBLAS to do the matrix multiplications, but this was slow. I did, however, notice something interesting.

Multiplying a 100x100 matrix with a 100x100 matrix was slow, but multiplying a 1,000,000x100 matrix with a 100x100 matrix was relatively fast. This made me think: if, instead of one scan from left to right, I ran 10,000 scans in parallel, this should be pretty fast, and if I multiplied the partial results together once the scans were done, I would get the same result, just faster.

Res_1    = M_1 . M_2 . ... . M_(n/1000)
Res_2    = M_(n/1000 + 1) . M_(n/1000 + 2) . ... . M_(2n/1000)
...
Res_1000 = M_(999n/1000 + 1) . M_(999n/1000 + 2) . ... . M_n
Res      = Res_1 . Res_2 . ... . Res_1000
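The split above relies only on matrix multiplication being associative; the order of the factors must still be preserved, since the product is not commutative. A minimal CPU sketch of the idea follows, assuming tiny 2x2 integer matrices so the comparison is exact (the question uses 100x100 floats). The names `Mat`, `mul`, `product` and `chunked_product` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2x2 integer matrix, row-major: [a b; c d].
struct Mat { long long a, b, c, d; };

Mat mul(const Mat& x, const Mat& y) {
    return {x.a * y.a + x.b * y.c, x.a * y.b + x.b * y.d,
            x.c * y.a + x.d * y.c, x.c * y.b + x.d * y.d};
}

bool equal(const Mat& x, const Mat& y) {
    return x.a == y.a && x.b == y.b && x.c == y.c && x.d == y.d;
}

// Plain left-to-right scan over ms[begin, end).
Mat product(const std::vector<Mat>& ms, std::size_t begin, std::size_t end) {
    Mat res{1, 0, 0, 1};  // identity
    for (std::size_t i = begin; i < end; ++i) res = mul(res, ms[i]);
    return res;
}

// Split the sequence into contiguous chunks, reduce each chunk on its own,
// then multiply the partial results in their original order.
Mat chunked_product(const std::vector<Mat>& ms, std::size_t chunks) {
    assert(chunks > 0);
    const std::size_t n = ms.size();
    std::vector<Mat> partial;
    for (std::size_t k = 0; k < chunks; ++k) {
        // Each chunk is independent of the others, so these scans
        // are the part that could run in parallel.
        partial.push_back(product(ms, k * n / chunks, (k + 1) * n / chunks));
    }
    return product(partial, 0, partial.size());
}
```

Calling `chunked_product(ms, 4)` on any sequence `ms` should give the same matrix as `product(ms, 0, ms.size())`. With floats the two results can differ slightly in the last bits, because the rounding happens in a different order.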

It's worth noting that M_1 ... M_n are drawn from a set of about 100 different matrices, so space consumption isn't really a problem. All I need is to be able to do multiple multiplies in one operation.

Now here is my problem. I've written a matrix-matrix (sgemm) implementation inspired by the one NVIDIA demonstrates in their documentation, but it is about 4 times slower than CUBLAS. Does anyone know how CUBLAS works? And is the code available somewhere?

Martin Kristiansen asked Feb 09 '12 18:02




1 Answer

Have you looked at the latest CUBLAS (version 4.1)? It includes a new batched GEMM mode specifically intended for large batches of small matrix-matrix multiplies. I would suggest doing a pairwise multiplication tree as Jonathan Dursi suggested in his answer, using the CUBLAS batched API to accelerate it, rather than writing your own custom kernel as he suggests.

CUBLAS 4.1 is included with the CUDA Toolkit v4.1.
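To make the shape of a pairwise multiplication tree concrete, here is a rough CPU-only sketch. At each level, neighbouring matrices are paired up and all pairs are multiplied in one batch, so a sequence of n matrices needs about log2(n) batched calls. `batched_multiply` is a hypothetical stand-in written for this example; on the GPU that step is where a batched GEMM call such as `cublasSgemmBatched` would presumably go. The row-major layout and the names `Matrix` and `tree_product` are assumptions of the sketch, not part of CUBLAS.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<float>;  // n x n, row-major (assumed layout)

// Hypothetical CPU stand-in for one batched GEMM call:
// out[i] = a[i] * b[i] for every i in the batch.
std::vector<Matrix> batched_multiply(const std::vector<Matrix>& a,
                                     const std::vector<Matrix>& b,
                                     std::size_t n) {
    assert(a.size() == b.size());
    std::vector<Matrix> out(a.size(), Matrix(n * n, 0.0f));
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t r = 0; r < n; ++r)
            for (std::size_t k = 0; k < n; ++k)
                for (std::size_t c = 0; c < n; ++c)
                    out[i][r * n + c] += a[i][r * n + k] * b[i][k * n + c];
    return out;
}

// Reduce M_1 ... M_n to a single matrix, one batch per tree level.
Matrix tree_product(std::vector<Matrix> level, std::size_t n) {
    assert(!level.empty());
    while (level.size() > 1) {
        std::vector<Matrix> left, right;
        for (std::size_t i = 0; i + 1 < level.size(); i += 2) {
            left.push_back(level[i]);       // M_i
            right.push_back(level[i + 1]);  // M_(i+1)
        }
        std::vector<Matrix> next = batched_multiply(left, right, n);
        // An unpaired last matrix is carried to the next level unchanged,
        // which keeps the factors in their original order.
        if (level.size() % 2 == 1) next.push_back(level.back());
        level = std::move(next);
    }
    return level.front();
}
```

The batches shrink by half at every level, so the early levels are the ones with enough independent multiplies to keep a GPU busy; the last few levels are essentially single small multiplies again.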

CUBLAS batched GEMM API improves performance of batches of small matrices

harrism answered Sep 29 '22 22:09