I have a monthly returns matrix of size 1000x300. I would like to average every 12 columns within each row to get annual returns, which should give me a 1000x25 matrix.
How would I go about doing this in MATLAB?
Through some quick searching, I believe I can use the reshape function somehow, but I am having trouble figuring out how to implement it in my code's loop.
So far, this is my attempt.
for i = 1:25
Strategy1.MeanReturn(:,i) = mean(Data.Return(:,i+1):Data.Return(:,i*12+1));
end
FYI, the +1 is there because I am ignoring the first column of the matrix.
But this gives me a single NaN value.
You can stack the desired submatrices along the first dimension of a 3D array, then do the average along that dimension, and squeeze out the resulting singleton dimension:
x = rand(10,20); % example data. 1000x300 in your case
N = 4; % group size. 12 in your case
y = reshape(x.', N, size(x,2)/N, []);
result = squeeze(mean(y,1)).';
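To see why the transpose is needed, here is a small sketch on a 2x6 matrix with a group size of 3. The values are made up for illustration. reshape fills arrays in column-major order, so transposing first places each row's consecutive months next to each other before they are cut into groups.

```matlab
% Illustrative data: 2 rows, 6 "months" per row
x = [ 1  2  3  4  5  6;
     10 20 30 40 50 60];
N = 3;                                   % group size

% After the transpose, each column of x.' holds one original row,
% so every run of N consecutive elements is one group of that row.
y = reshape(x.', N, size(x,2)/N, []);    % N x (number of groups) x (number of rows)

% Average over the first dimension, drop it, and restore row orientation.
result = squeeze(mean(y,1)).';           % (number of rows) x (number of groups)
disp(result)
```

Each row of result holds the group means of the matching row of x. Note that this assumes the number of columns is a multiple of N and that x has more than one row.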
try this:
B = zeros(1000,25);
A = rand(1000,300);
for i = 1:25
B(:,i) = mean(A(:,(i-1)*12+1:i*12),2);
end
I just tested it by building a sum of ones, and it worked.
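A similar sanity check can be sketched with data whose block means are known in advance. The test matrix below is an assumption for illustration only: every 12-column block is filled with its own block index, so the expected annual mean of block i is simply i.

```matlab
% Build a 1000x300 test matrix where columns (i-1)*12+1 : i*12 all equal i
A = kron(1:25, ones(1000,12));

B = zeros(1000,25);                      % pre-allocate the result
for i = 1:25
    % mean along dimension 2 averages the 12 columns of block i per row
    B(:,i) = mean(A(:,(i-1)*12+1:i*12),2);
end

% Every row of B should read 1, 2, ..., 25
assert(isequal(B, repmat(1:25, 1000, 1)));
```

If the column ranges were off by one, neighbouring blocks would mix and the assertion would fail.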
Loops aren't always slow. In fact, tests performed by MathWorks showed an average performance improvement of 40% in R2015b as a result of the redesigned Execution Engine (JIT):
The average performance improvement across all tests was 40%. Tests consisted of code that used a range of MATLAB products. Although not all applications ran faster with the redesign, the majority of these applications ran at least 10% faster in R2015b than in R2015a.
and
The performance benefit of JIT compilation is greatest when MATLAB code is executed additional times and can re-use the compiled code. This happens in common cases such as for-loops or when applications are run additional times in a MATLAB session
A quick benchmark of the three solutions:
%% bushmills answer, saved as bushmills.m
function B = bushmills(A,N)
B = zeros(size(A,1),size(A,2)/N);
for i = 1:size(A,2)/N
B(:,i) = mean(A(:,(i-1)*N+1:i*N),2); % use the group size N instead of a hard-coded 12
end
end
A = rand(1000,300); N = 12;
%% Luis Mendo's answer:
lmendo = @(A,N) squeeze(mean(reshape(A.', N, size(A,2)/N, []))).';
%% Divakar's answer:
divakar = @(A,N) reshape(mean(reshape(A,size(A,1),N,[]),2),size(A,1),[]);
b = @() bushmills(A,N);
l = @() lmendo(A,N);
d = @() divakar(A,N);
tb = timeit(b); tl = timeit(l); td = timeit(d); % store the timings so they can be reused below
sprintf('Bushmill: %e\nLuis Mendo: %e\nDivakar: %e', tb, tl, td)
ans =
Bushmill: 1.102774e-03
Luis Mendo: 1.611329e-03
Divakar: 1.888878e-04
sprintf('Relative to fastest approach:\nDivakar: %0.5f\nBushmill: %0.5f\nLuis Mendo: %0.5f', 1, tb/td, tl/td)
ans =
Relative to fastest approach:
Divakar: 1.00000
Bushmill: 5.34464
Luis Mendo: 10.73969
The loop approach (with pre-allocation) is approximately 40% faster than the squeeze(mean(reshape(...))) solution. Divakar's solution beats both by a mile.
It might be different for other values of A and N, but I haven't tested all.
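To check how the ranking holds up for other inputs, one could time the two vectorized versions over a few matrix shapes. This is only a sketch: the sizes below are arbitrary choices, not values from the benchmark above, and the anonymous functions are restated so the snippet runs on its own.

```matlab
% Vectorized variants, written in terms of their own arguments A and N
divakar = @(A,N) reshape(mean(reshape(A,size(A,1),N,[]),2),size(A,1),[]);
lmendo  = @(A,N) squeeze(mean(reshape(A.', N, size(A,2)/N, []))).';

N = 12;                                   % group size
sizes = [100 120; 1000 300; 2000 1200];   % hypothetical [rows cols], cols divisible by N

for k = 1:size(sizes,1)
    A  = rand(sizes(k,1), sizes(k,2));
    td = timeit(@() divakar(A,N));        % timeit runs the handle several times
    tl = timeit(@() lmendo(A,N));
    fprintf('%dx%d: Divakar %e s, Luis Mendo %e s, ratio %.2f\n', ...
            sizes(k,1), sizes(k,2), td, tl, tl/td);
end
```

The absolute numbers will depend on the machine and MATLAB release, so only the ratios within one run are meaningful.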