how to avoid the loop to reduce the computation time of this code (one solution of my last question):
I hope to find the column vectors of A(1:3,:)
whose corresponding values in M(4,:)
are not part of one of the vectors of the cell X
(and obviously not equal to one of these vectors). I look for a fast solution if X
is very large.
M = [1007 1007 4044 1007 4044 1007 5002 5002 5002 622 622;
552 552 300 552 300 552 431 431 431 124 124;
2010 2010 1113 2010 1113 2010 1100 1100 1100 88 88;
7 12 25 15 12 30 2 10 55 32 12];
Here I take directly A
:
A = [1007 4044 5002 622;
552 300 431 124;
2010 1113 1100 88];
A
contains unique column vectors of M(1:3,:)
X = {[2 5 68 44],[2 10 55 9 17],[1 55 6 7 8 9],[32 12]};
[~, ~, subs] = unique(M(1:3,:)','rows');
A4 = accumarray(subs(:),M(4,:).',[],@(x) {x});
%// getting a mask of which columns we want
idxC(length(A4)) = false;
for ii = 1:length(A4)
idxC(ii) = ~any(cellfun(@(x) all(ismember(A4{ii},x)), X));
end
Displaying the columns we want
out = A(:,idxC)
Results:
>> out
out =
1007 4044
552 300
2010 1113
the column vector [5002;431;1100]
was eliminated because [2;10;55]
is contained in X{2} = [2 10 55 9 17]
the column vector [622;124;88]
was eliminated because [32 12] = X{4}
Another example: with the same X
M = [1007 4044 1007 4044 1007 5002 5002 5002 622 622 1007 1007 1007;
552 300 552 300 552 431 431 431 124 124 552 11 11;
2010 1113 2010 1113 2010 1100 1100 1100 88 88 2010 20 20;
12 25 15 12 30 2 10 55 32 12 7 12 7];
X = {[2 5 68 44],[2 10 55 9 17],[1 55 6 7 8 9],[32 12]};
A = [1007 4044 5002 622 1077;
552 300 431 124 11;
2010 1113 1100 88 20];
Results: (with scmg answer)
I get if A
sorted according to the first row: (correct result)
out =
1007 1007 4044
11 552 300
20 2010 1113
if I do not sort the matrix A
, I get: (false result)
out =
4044 5002 622
300 431 124
1113 1100 88
the column vector A(:,4) = [622;124;88]
should be eliminated because [32 12] = X{4}
.
the column vector [5002;431;1100]
should be eliminated because [2;10;55]
is contained in X{2} = [2 10 55 9 17]
Recursion. Another way we can avoid using imperative loops is through recursion. Recursion is simple. Have a function call itself (which creates a loop) and design an exit condition out of that loop.
Putting all together. Continuing on the challenge to reduce the iterations on our algorithm, we have to perform the following steps: build the "index" with the information to be accessed later. iterate over the loop and fetch the additional information from the previously created "index"
The best ways to improve loop performance are to decrease the amount of work done per iteration and decrease the number of loop iterations. Generally speaking, switch is always faster than if-else , but isn't always the best solution.
The answer of Ben Voigt is great, but the line for A4i = A4{ii}
is the one causing issues : the for loop doesn't work this way with column vectors :
%row vector
for i = 1:3
disp('foo');
end
foo
foo
foo
%column vector
for i = (1:3).'
disp('foo');
end
foo
Just try for A4i = A4{ii}.'
instead and it should get your work done!
Now, if we look at the output :
A(:,idxC) =
4044 5002
300 431
1113 1100
As you can see, the final result is not what we expected.
As long as unique
does a kind of sort, the subs are not numbered by the order of encounter in A, but by order of encounter in C (which is sorted) :
subs =
2
2
3
2
3
2
4
4
4
1
1
Therefore you should pass by the matrix given by unique
rather than A to get your final output
Enter
[C, ~, subs] = unique(M(1:3,:)','rows');
%% rather than [~, ~, subs] = unique(M(1:3,:)','rows');
Then, to get the final output, enter
>> out = C(idxC,:).'
out =
1007 4044
552 300
2010 1113
In this case, you should not be trying to eliminate loops. The vectorization is actually hurting you badly.
In particular (giving a name to your anonymous lambda)
issubset = @(x) all(ismember(A4{ii},x))
is ridiculously inefficient, because it doesn't short-circuit. Replace that with a loop.
Same for
any(cellfun(issubset, X))
Use an approach similar to this instead:
idxC = true(size(A4));
NX = numel(X);
for ii = 1:length(A4)
for jj = 1:NX
xj = X{jj};
issubset = true;
for A4i=A4{ii}
if ~ismember(A4i, xj)
issubset = false;
break;
end;
end;
if issubset
idxC(ii) = false;
break;
end;
end;
end;
The two break
statements, and especially the second one, trigger an early exit that potentially saves you a huge amount of computation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With