
matlab: eliminate elements from array

I have quite a big array. To keep things simple, let's reduce it to:

A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];

So there is a group of 1's (4 elements), 2's (2 elements), 3's (4 elements), 4's (2 elements) and 5's (8 elements). Now I want to keep only the columns that belong to a group of 3 or more elements, so the result would be:

B = [1 1 1 1 3 3 3 3 5 5 5 5 5 5 5 5];

I was doing it with a for loop, scanning the 1's, 2's, 3's and so on separately, but it's extremely slow with big arrays... Thanks for any suggestions on how to do this more efficiently :) Art.

Art asked Sep 20 '12 11:09


3 Answers

A general approach

If your vector is not necessarily sorted, you need to count the number of occurrences of each element in the vector. histc is made just for that:

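elem = unique(A);                        % distinct values in A
counts = histc(A, elem);                 % number of occurrences of each value
B = A;
B(ismember(A, elem(counts < 3))) = [];   % delete values that occur fewer than 3 times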

The last line picks the elements that have fewer than 3 occurrences and deletes them.

An approach for a grouped vector

If your vector is "semi-sorted", that is if similar elements in the vector are grouped together (as in your example), you can speed things up a little by doing the following:

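start_idx = find([true, diff(A) ~= 0]);          % first index of each group (also works if A(1) is 0)
counts = diff([start_idx, numel(A) + 1]);        % length of each group
B = A;
B(ismember(A, A(start_idx(counts < 3)))) = [];   % delete values whose group is shorter than 3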

Again, note that the vector need not be entirely sorted; it is enough that equal elements are adjacent to each other.
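If the same value can show up in more than one group (an assumption that goes beyond the example in the question), the ismember test above would delete all of its occurrences as soon as one of its groups is short. A minimal sketch that filters group by group instead, where run_id is a helper name introduced here purely for illustration:

```matlab
A = [1 1 1 2 2 1 1 3 3 3];                  % the value 1 appears in two separate groups
is_start = [true, diff(A) ~= 0];            % true at the first element of each group
start_idx = find(is_start);                 % [1 4 6 8]
counts = diff([start_idx, numel(A) + 1]);   % group lengths: [3 2 2 3]
run_id = cumsum(is_start);                  % group number of every element (hypothetical helper)
B = A(counts(run_id) >= 3);                 % keep elements whose own group has 3+ members
```

For this input B is [1 1 1 3 3 3]: the long group of 1's survives, while the short group of 1's is dropped along with the 2's.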

Eitan T answered Oct 22 '22 09:10


Here is my two-liner

counts = accumarray(A', 1);
B = A(ismember(A, find(counts>=3)));

accumarray counts how many times each value occurs in A. find extracts the values that meet your '3 or more elements' criterion. Finally, ismember tells you where they are in A. Note that A need not be sorted. Of course, this only works if A contains positive integer values, because accumarray uses them as subscripts.
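If A holds values that are not positive integers, one possible workaround (a sketch, not part of the two-liner above) is to first map each value to its index among the unique values, using the third output of unique. The sample data and the names grp and keep are made up for illustration:

```matlab
A = [0.5 0.5 0.5 -2 -2 7 7 7 7];   % illustrative data with non-integer and negative values
[~, ~, grp] = unique(A);           % grp(k) = position of A(k) in the sorted unique values
counts = accumarray(grp(:), 1);    % occurrences of each unique value
keep = counts(grp) >= 3;           % true for elements whose value occurs at least 3 times
B = A(keep);
```

Here B comes out as [0.5 0.5 0.5 7 7 7 7]. As in the two-liner, A does not have to be sorted.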

angainor answered Oct 22 '22 10:10


What you are describing is called run-length encoding.

There is software for this in MATLAB on the File Exchange. Or you can do it directly as follows:

len = diff([ 0 find(A(1:end-1) ~= A(2:end)) length(A) ]);
val = A(logical([ A(1:end-1) ~= A(2:end) 1 ]));

Once you have the run-length encoding, you can remove elements based on the run length, e.g.:

idx = (len>=3);
len = len(idx);
val = val(idx);

And then decode to get the array you want:

i = cumsum(len);
j = zeros(1, i(end));
j(i(1:end-1)+1) = 1; 
j(1) = 1; 
B = val(cumsum(j));
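
In newer MATLAB releases (R2015a and later, where repelem is available) the decoding step can probably be shortened. A minimal sketch of the whole encode, filter and decode sequence under that assumption, with change and keep as illustrative variable names:

```matlab
A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];
change = A(1:end-1) ~= A(2:end);             % true where a run ends
len = diff([0, find(change), numel(A)]);     % run lengths: [4 2 4 2 8]
val = A([change, true]);                     % run values:  [1 2 3 4 5]
keep = len >= 3;                             % runs with 3 or more elements
B = repelem(val(keep), len(keep));           % repeat each kept value by its run length
```

This gives the same B as the cumsum-based decoding for the example array.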
robince answered Oct 22 '22 08:10