
How to measure the gflops of a matrix multiplication kernel?

In the book Programming Massively Parallel Processors, the number of GFLOPs is used to compare the efficiency of different matrix multiplication kernels. How would I compute this for my own kernels on my own machine?

Somewhere in the NVIDIA forums I found this 'algorithm', but I don't know how valid it is or where the factor of two comes from.

NumOps = 2 * pow(MatrixSize,3)
gflops = 1.0e-9 * NumOps / ExecutionTime

p.s. please feel free to change the tags...

asked Jul 29 '11 by Framester

1 Answer

You can measure the GFLOPs by running the algorithm with a large input and measuring the execution time, then putting the execution time and matrix size into that formula. For matrix sizes big enough to keep the entire machine busy, the FLOP rate is only weakly dependent on matrix size.

The GPU matrix multiplication algorithm performs the same number of floating-point operations as the naive algorithm.

/* Naive O(N^3) multiply; assumes C is zero-initialized */
for (i = 0; i < MatrixSize; i++)
  for (j = 0; j < MatrixSize; j++)
    for (k = 0; k < MatrixSize; k++)
      C[j][i] += A[j][k] * B[k][i];

There are 2 floating-point operations in the loop body (one multiply and one add), and MatrixSize * MatrixSize * MatrixSize iterations of the loop body, which gives you the formula for NumOps. GFLOPs is just the number of operations per second, divided by 10^9 ('giga').

answered Sep 27 '22 by Heatsink