
Looping statements performance and pre-allocating the looping statement itself

This observation is not that important, because the time spent in the loop body will probably be much higher than the overhead of the loop statement itself. But anyway, I will share it since I searched and couldn't find a topic about this. I always had the impression that pre-allocating the array I would loop over, and then looping on it, would be better than building it directly in the for statement, so I decided to check. The code compares the efficiency of these two for loops:

N=1000000; % same N as in the full script further down
disp('Pure for with column on statement:')
tic
for k=1:N
end
toc

disp('Pure for with column declared before statement:')
tic
m=1:N;
for k=m
end
toc

But the results I got are:

Pure for with column on statement:
Elapsed time is 0.003309 seconds.
Pure for with column declared before statement:
Elapsed time is 0.208744 seconds.

Why the hell is that? Shouldn't pre-allocating be faster?

In fact, the MATLAB help for the for statement (help for) says:

Long loops are more memory efficient when the colon expression appears in the FOR statement since the index vector is never created.

So, contrary to my expectations, the colon expression in the for statement is better: it does not allocate the index vector and is therefore faster.
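A single tic/toc pair can be skewed by one-off effects, so here is a minimal sketch of the same comparison repeated a few times, keeping the best time for each form. The repetition count nRuns and the variable names tInline and tStored are my own choices for illustration, not part of the original script.

```matlab
% Sketch: repeat each measurement and keep the best time, so that one-off
% effects (first run, background load) matter less.
N = 1000000;      % same loop length as in the question
nRuns = 5;        % assumption: number of repetitions, chosen arbitrarily

tInline = inf;    % best time with the colon expression in the for statement
tStored = inf;    % best time with the index vector created beforehand
for r = 1:nRuns
    t0 = tic;
    for k = 1:N   % colon expression inside the for statement
    end
    tInline = min(tInline, toc(t0));

    t0 = tic;
    m = 1:N;      % index vector created first (N doubles, 8*N bytes)
    for k = m
    end
    tStored = min(tStored, toc(t0));
end
fprintf('colon in statement:     %.6f s\n', tInline);
fprintf('vector declared before: %.6f s\n', tStored);
```

The absolute numbers will depend on the machine and the MATLAB release, so only the ratio between the two lines is worth reading.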

I wrote the following script to test other cases that I also expected to be faster:

% For comparison:
N=1000000;

disp('Pure for loop on cell declared on statement:')
tic
for k=repmat({1},1,N)
end
toc

disp('Pure for loop on cell declared before statement:')
tic
mcell=repmat({1},1,N);
for k=mcell
end
toc

disp('Pure for loop calculating length on statement:')
tic 
for k=1:length(mcell)
end
toc

disp('Pure for loop calculating length before statement:')
tic
lMcell = length(mcell);
for k=1:lMcell
end
toc

disp('Pure while loop using le:')
% While comparison:
tic
k=1;
while (k<=N)
  k=k+1;
end
toc

disp('Pure while loop using lt+1:')
% While comparison:
tic
k=1;
while (k<N+1)
  k=k+1;
end
toc


disp('Pure while loop using lt+1 pre allocated:')
tic
k=1;
myComp = N+1;
while (k<myComp)
  k=k+1;
end
toc

And the timings are:

Pure for loop on cell declared on statement:
Elapsed time is 0.259250 seconds.
Pure for loop on cell declared before statement:
Elapsed time is 0.260368 seconds.
Pure for loop calculating length on statement:
Elapsed time is 0.012132 seconds.
Pure for loop calculating length before statement:
Elapsed time is 0.003027 seconds.
Pure while loop using le:
Elapsed time is 0.005679 seconds.
Pure while loop using lt+1:
Elapsed time is 0.006433 seconds.
Pure while loop using lt+1 pre allocated:
Elapsed time is 0.005664 seconds.

Conclusions:

  • You can gain a bit of performance just by looping on a colon expression in the for statement, but the gain may be negligible compared to the time spent in the loop body.
  • For cells the difference seems to be negligible.
  • It is better to compute the length before the loop than inside the for statement.
  • The while loop has about the same efficiency as the for loop that does not pre-allocate the vector, which makes sense given the help text quoted above.
  • As expected, it is better to calculate fixed expressions before the while statement.

But the question I can't answer is about the cell: why isn't there any time difference? Could the overhead be much smaller than what was observed for doubles? Or does MATLAB have to allocate the cells anyway, since a cell is not a basic type like a double?
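This does not answer the cell question, but a small sketch can at least show what each form has to hold in memory and what the loop variable looks like when iterating over a cell array. The small N and the names infoM, infoC and value are chosen here for illustration only.

```matlab
N = 1000;                      % small N, just to inspect sizes
m = 1:N;                       % double row vector
c = repmat({1}, 1, N);         % 1-by-N cell array, each cell holds a double

infoM = whos('m');             % whos returns a struct with a bytes field
infoC = whos('c');
fprintf('double vector: %d bytes\n', infoM.bytes);
fprintf('cell array:    %d bytes\n', infoC.bytes);

% Each iteration over a cell array yields a 1-by-1 cell, not its contents.
for k = c(1:3)
    assert(iscell(k) && isequal(size(k), [1 1]));
    value = k{1};              % unwrap to get the stored double
end
```

The cell array has to exist before the loop can walk over it, whether repmat is called in the for statement or on the line before, which would be consistent with the two cell timings being almost identical.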

If you know other tricks on this topic, feel free to add them.


Just adding the timings to show the results of running feature('accel','off'), as suggested in @Magla's answer.

Pure for with column on statement:
Elapsed time is 0.181592 seconds.
Pure for with column declared before statement:
Elapsed time is 0.180011 seconds.
Pure for loop on cell declared on statement:
Elapsed time is 0.242995 seconds.
Pure for loop on cell declared before statement:
Elapsed time is 0.228705 seconds.
Pure for loop calculating length on statement:
Elapsed time is 0.178931 seconds.
Pure for loop calculating length before statement:
Elapsed time is 0.178486 seconds.
Pure while loop using le:
Elapsed time is 1.138081 seconds.
Pure while loop using lt+1:
Elapsed time is 1.241420 seconds.
Pure while loop using lt+1 pre allocated:
Elapsed time is 1.162546 seconds.

The results are now as expected…

Asked Oct 21 '22 02:10 by Werner


1 Answer

This finding has nothing to do with preallocating or not: it has to do with whether MATLAB is able to compute things on several cores. When you put the colon operator inside the for statement, it tells MATLAB to use several cores (i.e. multithreading).

If you restrict MATLAB to one core with feature('accel','off'), the observed difference with doubles vanishes. With cells, MATLAB does not make use of multithreading, so no difference can be observed (whatever the status of accel is).
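One way to probe the multi-core explanation separately from the accel switch is the documented maxNumCompThreads function, which limits the number of computational threads. This is only a sketch of such a check, not something the answer itself ran; the names nPrev, tOneThread and tDefault are chosen here for illustration.

```matlab
N = 1000000;                       % same loop length as in the question

nPrev = maxNumCompThreads(1);      % limit MATLAB to one computational thread
t0 = tic;
for k = 1:N                        % colon expression in the for statement
end
tOneThread = toc(t0);

maxNumCompThreads(nPrev);          % restore the previous thread count
t0 = tic;
for k = 1:N
end
tDefault = toc(t0);

fprintf('one thread:      %.6f s\n', tOneThread);
fprintf('default threads: %.6f s\n', tDefault);
```

If the two timings come out about equal, the thread count is probably not what separates the loop forms on that machine; the sketch only measures, it does not settle the cause.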

The for loop is multithreaded when a colon is used, and only then. The following vectors of similar length do not engage several cores:

  • for k = randperm(N)
  • for k = linspace(1,N,N)

but for k = 1:0.9999:N is multithreaded.
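The four loop headers mentioned here can be timed side by side with a short script. This is a minimal sketch in the same tic/toc style as the question; the labels and times variables are introduced for illustration, and the randperm and linspace timings include the cost of building the vector.

```matlab
N = 1000000;      % same loop length as in the question
labels = {'1:N', '1:0.9999:N', 'randperm(N)', 'linspace(1,N,N)'};
times = zeros(1, numel(labels));

tic
for k = 1:N                 % plain colon expression
end
times(1) = toc;

tic
for k = 1:0.9999:N          % colon expression with a non-integer step
end
times(2) = toc;

tic
for k = randperm(N)         % vector built by a function call
end
times(3) = toc;

tic
for k = linspace(1,N,N)     % same values as 1:N, built by a function call
end
times(4) = toc;

for i = 1:numel(labels)
    fprintf('%-16s %.6f s\n', labels{i}, times(i));
end
```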

One explanation can be found on a MATLAB support page. It states that multi-core processing is possible when "The operations in the algorithm carried out by the function are easily partitioned into sections that can be executed concurrently." With a colon operator, MATLAB knows that the for loop can be partitioned.

Answered Oct 24 '22 02:10 by marsei