I was programming something in MATLAB and, as recommended, I am always trying to use vectorization. But in the end the program was quite slow. So I found out that in one place the code is significantly faster when using loops (example below).
I would like to know if I misinterpreted something or did something wrong, because performance is important in this case, and I don't want to keep guessing if vectorization or loops are going to be faster.
% data initialization
k = 8;
n = 2^k+1;
h = 1/(n-1);
cw = 0.1;
iter = 10000;
uloc = zeros(n);
fploc = uloc;
uloc(2:end-1,2:end-1) = 1;
vloc = uloc;
ploc = ones(n);
uloc2 = zeros(n);
fploc2 = uloc2;
uloc2(2:end-1,2:end-1) = 1;
vloc2 = uloc2;
ploc2 = ones(n);
%%%%%%%%%%%%%%%%%%%%%%
% vectorized version %
%%%%%%%%%%%%%%%%%%%%%%
tic
for it=1:iter
il=2:4;
jl=2:4;
fploc(il,jl) = h/6*(-uloc(il-1,jl-1) + uloc(il-1,jl)...
-2*uloc(il,jl-1)+2*uloc(il,jl+1)...
-uloc(il+1,jl) + uloc(il+1,jl+1)...
...
-vloc(il-1,jl-1) - 2*vloc(il-1,jl)...
+vloc(il,jl-1) - vloc(il,jl+1)...
+ 2*vloc(il+1,jl) + vloc(il+1,jl+1))...
...
+cw*h^2*(-ploc(il-1,jl)-ploc(il,jl-1)+4*ploc(il,jl)...
-ploc(il+1,jl)-ploc(il,jl+1));
end
toc
%%%%%%%%%%%%%%%%%%%%%%
% loop version %
%%%%%%%%%%%%%%%%%%%%%%
tic
for it=1:iter
for il=2:4
for jl=2:4
fploc2(il,jl) = h/6*(-uloc2(il-1,jl-1) + uloc2(il-1,jl)...
-2*uloc2(il,jl-1)+2*uloc2(il,jl+1)...
-uloc2(il+1,jl) + uloc2(il+1,jl+1)...
...
-vloc2(il-1,jl-1) - 2*vloc2(il-1,jl)...
+vloc2(il,jl-1) - vloc2(il,jl+1)...
+ 2*vloc2(il+1,jl) + vloc2(il+1,jl+1))...
...
+cw*h^2*(-ploc2(il-1,jl)-ploc2(il,jl-1)+4*ploc2(il,jl)...
-ploc2(il+1,jl)-ploc2(il,jl+1));
end
end
end
toc
A major reason why vectorization is faster than its for loop counterpart is due to the underlying implementation of Numpy operations. As many of you know (if you're familiar with Python), Python is a dynamically typed language.
Vectorization is not always faster than for . Sometimes vectorization requires that you construct multiple intermediate arrays, with it taking time and memory to construct them. The computation of the arrays is not always straight-forward.
The process of revising loop-based, scalar-oriented code to use MATLAB matrix and vector operations is called vectorization.
MATLAB is designed to perform vector operations really quickly. MATLAB is an interpreted language, which is why loops are so slow in it. MATLAB sidesteps this issue by providing extremely fast (usually written in C, and optimized for the specific architecture) and well tested functions to operate on vectors.
I didn't go through your code, but the JIT compiler in recent versions of Matlab has improved to the point where the situation you're facing is quite common - loops can be faster than vectorized code. It is difficult to know in advance which will be faster, so the best approach is to write the code in the most natural fashion, profile it and then if there is a bottleneck, try switching from loops to vectorized (or the other way).
MATLAB's just in time compiler (JIT) has been improved significantly over the last couple years. And even though you are right that one should generally vectorize code, from my experience this is only true for certain operations and functions and also depends on how much data your functions are handling.
The best way for you to find out what works best, is to profile your MATLAB code with and without vectorization.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With