Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to bechmark a method in Octave?

Matlab has timeit method which is helpful to compare the performance of an implementation with another. I couldn't find something similar in octave. I wrote this benchmark method with runs a function f N times and then returns the total time taken. Is this a reasonable way to compare different implementations or am I missing something critical like "warmup"?

function elapsed_time_in_seconds = benchmark(f, N)
  % benchmark runs the function 'f' N times and returns the elapsed time in seconds.

  timeid = tic;
  for i=1:N
    output = f();
  end
  elapsed_time_in_seconds = toc(timeid);
end

like image 449
Turbo Avatar asked Jan 18 '21 23:01

Turbo


1 Answers

MATLAB's timeit does the following (you can read the whole function, it's an M-file):

  1. Obtain a rough estimate t_rough of the time for calling the function f.
  2. Use the estimate to determine N such that N*t_rough is about 0.001 s.
  3. Determine M such that M*N*t_rough is no more than 15 s, but M must be between 3 and 11.
  4. Loop M times:
  5.    Call f() N times and record the total time.
  6. Determine the median of the M times, divided by N.

The purpose of the two loops, M and N, is as follows: Calling f() N times ensures that the time measured by tic/toc is sufficiently large to be reliable, this loop avoids attempting to time something that is so short that it cannot be timed. Repeating the measurement M times and keeping the median attempts to make the measurement robust against delays caused by other stuff happening on your system, which can artificially inflate the recorded time.

The function subtracts the overhead of calling a function through its handle (determined experimentally by timing the call of an empty function), as well as the tic/toc call time (also determined experimentally). It does not subtract the cost of the inner loop, presumably because in MATLAB it is optimized by the JIT and its cost is negligible.

There are some further refinements. The function that determines t_rough first warms up tic and toc by calling each one twice, then it uses a while loop to ensure it calls f() for at least 0.001 s. But in this loop, if the first iteration takes at least 3 s, it just takes that time as the rough estimate. If the first iteration takes less time, the first time count is discarded (warmup), and then uses the median of all the subsequent calls as the rough estimate of the time.

There's also a lot of effort put into calling the function f() with the right number of output arguments.

The code has a lot of comments explaining the reason behind all these steps, it is worth reading.


As a minimum, I would augment your benchmark function as follows:

function elapsed_time_in_seconds = benchmark(f, N, M)
  % benchmark runs the function 'f' N*M times and returns the elapsed time in seconds.

  tic; [~] = toc; tic; [~] = toc; % warmup
  output = f(); % warmup

  t = zeros(M, 1);
  for k=1:M
    timeid = tic;
    for i=1:N
      output = f();
    end
    t(k) = toc(timeid) / N;
  end
  elapsed_time_in_seconds = median(t);
end

If you use the function to directly compare various alternatives, keeping N and M constant, then the overheads of tic, toc, function calls and loops is irrelevant.

This function does assume that f has one output argument, which is not necessarily the case. You could just call f() instead of output = f(), which will work for functions with or without output arguments. But if the function needs to have a certain number of outputs to work correctly, or to trigger computations that you want to time, then you'd have to adjust the function to call it with the right number of output arguments.

You could come up with some heuristic to determine M from N, which would make it a little easier to use this function.

like image 99
Cris Luengo Avatar answered Sep 20 '22 10:09

Cris Luengo