I have a matrix of 1s and -1s with randomly interspersed 0s:
%// create matrix of 1s and -1s
hypwayt = randn(10,5);
hypwayt(hypwayt > 0) = 1;
hypwayt(hypwayt < 0) = -1;
%// create numz random indices at which to insert 0s (pairs of indices may
%// repeat, so final number of inserted zeros may be < numz)
numz = 15;
a = 1;
b = 10;
r = round((b-a).*rand(numz,1) + a);
s = round((5-1).*rand(numz,1) + a);
for nx = 1:numz
    hypwayt(r(nx),s(nx)) = 0;
end
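If an exact number of zeros is wanted, one option is to sample distinct linear indices instead of (row, column) pairs. This is only a minimal sketch, assuming the same 10-by-5 size and the same numz as above; zidx is a name introduced here for illustration, and sign(randn(...)) stands in for the two thresholding lines.

```matlab
% Sketch only: place exactly numz zeros by sampling without replacement.
hypwayt = sign(randn(10,5));             % matrix of 1s and -1s
numz = 15;
zidx = randperm(numel(hypwayt), numz);   % numz distinct linear indices (hypothetical name)
hypwayt(zidx) = 0;                       % every index is unique, so no zero is lost
```

Because the indices cannot repeat, the number of inserted zeros equals numz.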
Input:
hypwayt =
-1 1 1 1 1
1 -1 1 1 1
1 -1 1 0 0
-1 1 0 -1 1
1 -1 0 0 0
-1 1 -1 -1 -1
1 1 0 1 -1
0 1 -1 1 -1
-1 0 1 1 0
1 -1 0 -1 -1
I want to count how many times the nonzero elements are repeated consecutively in a column, to produce something like the expected output shown below.
The basic idea (provided by @rayryeng): for each column independently, keep a running counter. The counter is 1 the first time you hit a value, and it increments every time the current value is the same as the previous one. As soon as you hit a different value, the counter resets to 1. The exception is a 0 in the input, which always gives a 0 in the output.
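To make the rule concrete, here is a small sketch that applies it to a single column. It uses the third column of the sample input; col and runs are names chosen for this illustration only.

```matlab
% Sketch of the counting rule on one column (third column of the sample input).
col  = [1; 1; 1; 0; 0; -1; 0; -1; 1; 0];
runs = zeros(size(col));                 % zeros in the input stay zero
for k = 1:numel(col)
    if col(k) == 0
        continue                         % a 0 always maps to 0
    end
    if k > 1 && col(k) == col(k-1)
        runs(k) = runs(k-1) + 1;         % same value as the previous row: increment
    else
        runs(k) = 1;                     % new value or first row: restart at 1
    end
end
disp(runs.')                             % 1 2 3 0 0 1 0 1 1 0
```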
Expected Output:
hypwayt_runs =
1 1 1 1 1
1 1 2 2 2
2 2 3 0 0
1 1 0 1 1
1 1 0 0 0
1 1 1 1 1
1 2 0 1 2
0 3 1 2 3
1 0 1 3 0
1 1 0 1 1
What's the cleanest way to accomplish this?
As motivated by Dev-IL, here's a solution using loops. Even though the code is readable, I would argue that it's slow, since you have to iterate through each element individually.
hypwayt = [-1 1 1 1 1;
1 -1 1 1 1;
1 -1 1 0 0;
-1 1 0 -1 1;
1 -1 0 0 0;
-1 1 -1 -1 -1;
1 1 0 1 -1;
0 1 -1 1 -1;
-1 0 1 1 0;
1 -1 0 -1 -1];
%// Initialize output array
out = ones(size(hypwayt));
%// A zero in the first row must also map to 0
out(1, hypwayt(1,:) == 0) = 0;
%// For each column
for idx = 1 : size(hypwayt, 2)
    %// Previous value initialized as the first row
    prev = hypwayt(1,idx);
    %// For each row after this point...
    for idx2 = 2 : size(hypwayt,1)
        %// If the current value isn't equal to the previous value...
        if hypwayt(idx2,idx) ~= prev
            %// Set the new previous value
            prev = hypwayt(idx2,idx);
            %// Case for 0
            if hypwayt(idx2,idx) == 0
                out(idx2,idx) = 0;
            end
            %// Else, reset the value to 1
            %// Already done by initialization
        %// If equal, increment
        %// Must also check for 0
        else
            if hypwayt(idx2,idx) ~= 0
                out(idx2,idx) = out(idx2-1,idx) + 1;
            else
                out(idx2,idx) = 0;
            end
        end
    end
end
>> out
out =
1 1 1 1 1
1 1 2 2 2
2 2 3 0 0
1 1 0 1 1
1 1 0 0 0
1 1 1 1 1
1 2 0 1 2
0 3 1 2 3
1 0 1 3 0
1 1 0 1 1
There should be a better way, I suppose, but this should work.
Using cumsum, diff, accumarray & bsxfun (A is the input matrix, i.e. hypwayt):
%// doing the 'diff' along default dim to get the adjacent equality
out = [ones(1,size(A,2));diff(A)];
%// Putting all other elements other than zero to 1
out(find(out)) = 1;
%// getting all the indexes of 0 elements
ind = find(out == 0);
%// doing 'diff' on indices to find adjacent indices
out1 = [0;diff(ind)];
%// Flagging the start of each run of adjacent indices (1 where the gap is not 1, 0 otherwise)
out1 = out1 ~= 1;
%// counting each unique group's number of elements
out1 = accumarray(cumsum(out1),1);
%// Creating a mask for next operation
mask = bsxfun(@le, (1:max(out1)).',out1.');
%// Doing colon operation from 2 to maxsize
out1 = bsxfun(@times,mask,(2:size(mask,1)+1).'); %'
%// Assign the values from the out1 to corresponding indices of out
out(ind) = out1(mask);
%// finally replace all elements of A which were zero to zero
out(A==0) = 0
Results:
Input:
>> A
A =
-1 1 1 1 1
1 -1 1 1 1
1 -1 1 0 0
-1 1 0 -1 1
1 -1 0 0 0
-1 1 -1 -1 -1
1 1 0 1 -1
0 1 -1 1 -1
-1 0 1 1 0
1 -1 0 -1 -1
Output:
>> out
out =
1 1 1 1 1
1 1 2 2 2
2 2 3 0 0
1 1 0 1 1
1 1 0 0 0
1 1 1 1 1
1 2 0 1 2
0 3 1 2 3
1 0 1 3 0
1 1 0 1 1
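The grouping steps in the middle of that code can be hard to follow, so here is a small sketch on a made-up index vector. The values in ind and the names starts, groupId, runLen and vals are assumptions for illustration; the calls are the same ones used above.

```matlab
% Hypothetical linear indices: 3,4,5 form one run, 8 stands alone, 11,12 form another.
ind     = [3; 4; 5; 8; 11; 12];
starts  = [0; diff(ind)] ~= 1;           % true where a new run of adjacent indices begins
groupId = cumsum(starts);                % 1 1 1 2 3 3
runLen  = accumarray(groupId, 1);        % elements per run: 3 1 2
% One mask column per run; row k is true while k <= length of that run.
mask = bsxfun(@le, (1:max(runLen)).', runLen.');
% Counter values 2,3,4,... down each column (the run's first element already holds 1).
vals = bsxfun(@times, mask, (2:size(mask,1)+1).');
counts = vals(mask);                     % 2 3 4 2 2 3, in the same order as ind
disp(counts.')
```

Each entry of counts is the value written at the matching position in ind, which is why the final assignment in the answer is simply out(ind) = out1(mask).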
Building upon rayryeng's answer, below is my take on a loop-based solution.
Inputs:
hypwayt = [
-1 1 1 1 1
1 -1 1 1 1
1 -1 1 0 0
-1 1 0 -1 1
1 -1 0 0 0
-1 1 -1 -1 -1
1 1 0 1 -1
0 1 -1 1 -1
-1 0 1 1 0
1 -1 0 -1 -1 ];
expected_out = [
1 1 1 1 1
1 1 2 2 2
2 2 3 0 0
1 1 0 1 1
1 1 0 0 0
1 1 1 1 1
1 2 0 1 2
0 3 1 2 3
1 0 1 3 0
1 1 0 1 1 ];
Actual code:
CNT_INIT = 2; %// a constant representing an initialized counter
out = hypwayt; %// "preallocation"
out(2:end,:) = diff(out); %// ...we'll deal with the top row later
hyp_nnz = hypwayt~=0; %// nonzero mask for later brevity
cnt = CNT_INIT; %// first initialization of the counter
for ind1 = 2:numel(out)
    switch abs(out(ind1))
        case 2 %// switch from -1 to 1 and vice versa:
            out(ind1) = 1;
            cnt = CNT_INIT;
        case 0 %// means we have the same number again:
            out(ind1) = cnt*hyp_nnz(ind1); %// put cnt unless we're zero
            cnt = cnt+1;
        case 1 %// means we transitioned to/from zero:
            out(ind1) = hyp_nnz(ind1); %// was it a nonzero element?
            cnt = CNT_INIT;
    end
end
%// Finally, take care of the top row:
out(1,:) = hyp_nnz(1,:);
Correctness test:
assert(isequal(out,expected_out))
I guess it may be simplified further by using some "complex" MATLAB functions, but IMHO it does seem elegant enough :)
Note: the top row of out is computed twice (once in the loop and once at the end), so there is a tiny inefficiency. However, this allows putting the entire logic into a single loop operating on numel(), which in my opinion justifies the small amount of extra computation.
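The three switch cases work because, for values restricted to -1, 0 and 1, the absolute difference between neighbours can only be 0, 1 or 2. A tiny sketch with a made-up column (col and d are illustrative names):

```matlab
% Hypothetical column covering every kind of transition.
col = [1; 1; -1; 0; 0; 1];
d   = abs(diff(col));
% d(1) = 0 : same value again (1 -> 1)
% d(2) = 2 : sign flip (1 -> -1)
% d(3) = 1 : transition into zero (-1 -> 0)
% d(4) = 0 : same value again (0 -> 0), which is why case 0 multiplies by the nonzero mask
% d(5) = 1 : transition out of zero (0 -> 1)
disp(d.')                                % 0 2 1 0 1
```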
That's a nice problem, and since @rayryeng has not proposed a vectorized solution, here is mine in a few lines. OK, that's not quite fair: it took me half a day to end up with this one. The basic idea is to use cumsum as the final function.
p = size(hypwayt,2); % keep nb of columns in mind
% H1 is the mask of consecutive identical values, but kept as an array of double (it will be incremented later)
H1 = [zeros(1,p);diff(hypwayt)==0];
% H2 is the mask of elements where a consecutive sequence of identical values ends. Note the first line of trues.
H2 = [true(1,p);diff(~H1)>0];
% 1st trick: compute the vectorized cumsum of H1
H3 = cumsum(H1(:));
% 2nd trick: take the diff of H3(H2).
% It yields a vector of the lengths of consecutive sequences of identical values, interleaved with some zeros.
% Subtract it from H1 at the same locations.
H1(H2) = H1(H2)-[0;diff(H3(H2))];
% H1 is ready to be cumsummed! Add one to the array, all lengths are decreased by one.
Output = cumsum(H1)+1;
% last force input zeros to be zero
Output(hypwayt==0) = 0;
And the expected output:
Output =
1 1 1 1 1
1 1 2 2 2
2 2 3 0 0
1 1 0 1 1
1 1 0 0 0
1 1 1 1 1
1 2 0 1 2
0 3 1 2 3
1 0 1 3 0
1 1 0 1 1
Let me add some explanations. The big trick is of course the second one; it took me a while to figure out how to compute the lengths of consecutive identical values quickly. The first one is just a little trick to compute the whole thing without any for-loop. If you cumsum H1 directly, you get the result with some offsets. These offsets are removed in a cumsum-compliant manner: take the local difference of some key values of the running sum and subtract it just after the end of each sequence. I select more of these key positions than strictly needed, because I also take the first row (the first line of H2), so that the first element of each column is seen as different from the last element of the previous column.
I hope it's a bit clearer now (and that there is no flaw in some special case ...).
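To see the offsets being cancelled, the same lines can be traced on one short column. This is only a sketch: col is a made-up input, p is 1 because there is a single column, and the commented values were worked out by hand for this input.

```matlab
% Trace of the cumsum trick on a single hypothetical column.
col = [1; 1; 1; -1; -1; 1];
p   = 1;                                 % one column
H1  = [zeros(1,p); diff(col)==0];        % 0 1 1 0 1 0 : same as previous row?
H2  = [true(1,p); diff(~H1)>0];          % 1 0 0 1 0 1 : first row and rows just after a run
H3  = cumsum(H1(:));                     % 0 1 2 2 3 3 : running sum with offsets
H1(H2) = H1(H2) - [0; diff(H3(H2))];     % 0 1 1 -2 1 -1 : negative entries cancel the offsets
counts = cumsum(H1) + 1;                 % 1 2 3 1 2 1
disp(counts.')
```

The -2 and -1 sit exactly where a run ends, so the running sum drops back to zero there before the next run starts counting.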