
Apply function to rolling window

Tags: vector, matlab

Say I have a long list A of values (say of length 1000) for which I want to compute the standard deviation over sliding windows of length 100, i.e. I want to compute std(A(1:100)), std(A(2:101)), std(A(3:102)), ..., std(A(901:1000)).

In Excel/VBA one can easily accomplish this by writing e.g. =STDEV(A1:A100) in one cell and then filling down in one go. My question is: how can one accomplish this efficiently in Matlab without resorting to expensive for-loops?


edit: Is it also possible to do this for a list of time series, e.g. when A has dimensions 1000 x 4 (i.e. 4 time series of length 1000)? The output matrix should then have dimensions 901 x 4.

Phil-ZXX asked Jan 21 '15 08:01


2 Answers

Note: For the fastest solution see Luis Mendo's answer

So firstly using a for loop for this (especially if those are your actual dimensions) really isn't going to be expensive. Unless you're using a very old version of Matlab, the JIT compiler (together with pre-allocation of course) makes for loops inexpensive.

Secondly - have you tried for loops yet? Because you should really try out the naive implementation first before you start optimizing prematurely.

Thirdly - arrayfun can make this a one-liner, but it is basically just a for loop with extra overhead and is very likely to be slower than a plain for loop if speed really is your concern.
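For reference, the arrayfun one-liner mentioned above could look roughly like this. It is only a sketch; the variable names n, A and l mirror the ones used in the rest of this answer.

```matlab
n = 1000;          % series length
A = rand(n,1);     % random test data
l = 100;           % window length

% One call to std per window start t; arrayfun hides the loop, it does not remove it.
S = arrayfun(@(t) std(A(t:t+l-1)), (1:n-l+1)');
```

S comes out as an (n-l+1)-by-1 column, one entry per window.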

Finally some code:

n = 1000;
A = rand(n,1);
l = 100;

for loop (hardly bulky, likely to be efficient):

S = zeros(n-l+1,1);  %//Pre-allocation of memory like this is essential for efficiency!
for t = 1:(n-l+1)
    S(t) = std(A(t:(t+l-1)));
end

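A vectorized (but memory-inefficient!) solution:

[X,Y] = meshgrid(1:(n-l+1), 1:l);  %// X: window start (columns), Y: offset within the window (rows)
S = std(A(X+Y-1));                 %// column t holds A(t:t+l-1); result is 1 x (n-l+1)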

A probably better vectorized solution (and a one-liner), though still memory-inefficient:

S = std(A(bsxfun(@plus, 0:n-l, (1:l)')));    %'

Note that with all these methods you can replace std with any function, so long as it operates on the columns of a matrix (which is the default in Matlab).
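As an illustration of swapping in other column-wise functions, here is a small sketch that builds an index matrix with one column per window and reuses it for a rolling median and a rolling range. The names ind, Med and Rng are made up for this example.

```matlab
n = 1000;
A = rand(n,1);
l = 100;

% l-by-(n-l+1) index matrix: column t holds the indices t, t+1, ..., t+l-1
ind = bsxfun(@plus, 0:n-l, (1:l)');

Med = median(A(ind));              % rolling median, 1-by-(n-l+1)
Rng = max(A(ind)) - min(A(ind));   % rolling range (max minus min), 1-by-(n-l+1)
```

Building ind once and reusing it keeps the extra memory cost to a single l-by-(n-l+1) array, however many statistics you compute.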


Going 2D:

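To handle a 2D A (several series stored in columns) the index array needs to be 3D. Adding the offsets 0, n, 2n, ... along the third dimension shifts the linear indices from one column of A to the next: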

n = 1000;
k = 4;
A = rand(n,k);
l = 100;

ind = bsxfun(@plus, permute(0:n:(k-1)*n, [3,1,2]), bsxfun(@plus, 0:n-l, (1:l)'));    %'
S = squeeze(std(A(ind)));
M = squeeze(mean(A(ind)));
%// etc...

OR

[X,Y,Z] = meshgrid(1:(n-l+1), 1:l, 0:n:(k-1)*n);
ind = X+Y+Z-1;
S = squeeze(std(A(ind)));
M = squeeze(mean(A(ind)));
%// etc...

OR

ind = bsxfun(@plus, 0:n-l, (1:l)');                                                  %'
S = zeros(n-l+1,k);
M = zeros(n-l+1,k);
for t = 1:k
    col = A(:,t);              %// t-th series
    S(:,t) = std(col(ind));
    M(:,t) = mean(col(ind));
    %// etc...
end

OR (taken from Luis Mendo's answer - note in his answer he shows a faster alternative to this simple loop)

S = zeros(n-l+1,k);
M = zeros(n-l+1,k);
for t = 1:(n-l+1)
    S(t,:) = std(A(t:(t+l-1),:));
    M(t,:) = mean(A(t:(t+l-1),:));
    %// etc...
end
Dan answered Oct 20 '22 16:10


What you're doing is basically a filter operation.
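To make the filter view concrete without any toolbox, here is a rough sketch that gets the rolling mean and standard deviation from two box-filter convolutions (rolling sums of the values and of their squares). The names w, s1, s2 and V are just for illustration.

```matlab
n = 1000;
A = rand(n,1);
l = 100;

w  = ones(l,1);                   % box kernel: a moving sum over l samples
s1 = conv(A,    w, 'valid');      % rolling sum of the values
s2 = conv(A.^2, w, 'valid');      % rolling sum of the squared values

M = s1 / l;                       % rolling mean
V = (s2 - s1.^2 / l) / (l - 1);   % rolling sample variance (N-1 normalization, like std)
S = sqrt(max(V, 0));              % clamp tiny negative round-off before the square root
```

The 'valid' option keeps only the n-l+1 complete windows. Be aware that the sum-of-squares formula can lose precision when the mean of the data is large compared to its spread.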

If you have access to the image processing toolbox,

stdfilt(A,ones(101,1)) %# assumes that data series are in columns

will do the trick (no matter the dimensionality of A). Note that if you also have access to the parallel computing toolbox, you can let filter operations like these run on a GPU, although your problem might be too small to generate noticeable speedups.
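Without any toolbox, newer MATLAB releases (R2016a onward, as far as I know) ship moving-window functions that cover this case directly. A minimal sketch, assuming such a release and data stored in columns:

```matlab
n = 1000;
k = 4;
A = rand(n,k);     % k series stored in columns
l = 100;

% Window = current sample plus the l-1 samples after it; 'discard' drops
% the incomplete windows at the end, leaving n-l+1 rows.
S = movstd(A,  [0 l-1], 'Endpoints', 'discard');
M = movmean(A, [0 l-1], 'Endpoints', 'discard');
```

Both results should be (n-l+1)-by-k, i.e. 901 x 4 for the sizes in the question.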

Jonas answered Oct 20 '22 14:10