I have a matrix (A) in the form of (much larger in reality):
205 204 201
202 208 202
How can I tally the co-incidence of numbers on a column-by-column basis and then output this to a matrix?
I'd want the final matrix to run from min(A):max(A) (or be able to specify a specific range) across the top and down the side and for it to tally co-incidences of numbers in each column. Using the above example:
    200 201 202 203 204 205 206 207 208
200   0   0   0   0   0   0   0   0   0
201   0   0   1   0   0   0   0   0   0
202   0   0   0   0   0   1   0   0   0
203   0   0   0   0   0   0   0   0   0
204   0   0   0   0   0   0   0   0   1
205   0   0   0   0   0   0   0   0   0
206   0   0   0   0   0   0   0   0   0
207   0   0   0   0   0   0   0   0   0
208   0   0   0   0   0   0   0   0   0
(Matrix labels are not required)
Two important points: the tallying needs to be non-duplicating and must occur in numerical order. For example, a column containing:
205
202
should be tallied as 202 occurring with 205 (as shown in the above matrix) but NOT as 205 with 202, the duplicate reciprocal. When deciding which number to use as the reference, it should be the smaller of the two.
EDIT:
sparse to the rescue!
Let your data and desired range be defined as
A = [ 205 204 201
202 208 202 ]; %// data. Two-row matrix
limits = [200 208]; %// desired range. It needn't include all values of A
Then
lim1 = limits(1)-1;
s = limits(2)-lim1;
cols = all((A>=limits(1)) & (A<=limits(2)), 1);
B = sort(A(:,cols), 1, 'descend')-lim1;
R = full(sparse(B(2,:), B(1,:), 1, s, s));
gives
R =
0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
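To make the indexing concrete, here is the same computation traced step by step for the example data. It is only an annotated sketch of the code above: the comments show the intermediate values, and `pairs` is a name introduced here purely for illustration.

```matlab
A = [205 204 201
     202 208 202];
limits = [200 208];

lim1 = limits(1) - 1;     % offset so that limits(1) maps to index 1 (199 here)
s    = limits(2) - lim1;  % side length of the result (9 here)

% Keep only the columns whose entries all lie within the range
cols = all((A >= limits(1)) & (A <= limits(2)), 1);   % all three columns kept

% Put the larger value of each column on top, then shift to 1-based indices
B = sort(A(:, cols), 1, 'descend') - lim1;            % [6 9 3; 3 5 2]

% Row index = smaller value, column index = larger value.
% sparse adds up the 1s of repeated (row, column) pairs.
R = full(sparse(B(2, :), B(1, :), 1, s, s));

% Map the nonzero positions back to the original values
[r, c] = find(R);
pairs = [r c] + lim1      % [201 202; 202 205; 204 208]
```

Each row of `pairs` reads as "smaller value, larger value", which matches the three 1s in the matrix above.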
Alternatively, you can dispense with sort and use matrix addition followed by triu to obtain the same result (possibly faster), provided no column contains the same value twice:
lim1 = limits(1)-1;
s = limits(2)-lim1;
cols = all( (A>=limits(1)) & (A<=limits(2)) , 1);
R = full(sparse(A(2,cols)-lim1, A(1,cols)-lim1, 1, s, s));
R = triu(R + R.');
Both approaches handle repeated columns (up to sorting), correctly increasing their tally. For example,
A = [205 204 201
201 208 205]
gives
R =
0 0 0 0 0 0 0 0 0
0 0 0 0 0 2 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
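One case where the two variants can differ is a column whose two entries are equal: the pair lands on the diagonal, and `R + R.'` counts it twice. The sketch below compares both variants on such an input. The names `R_sort`, `R0`, `R_triu` and `R_fixed` are introduced here for illustration, and the diagonal correction is a suggestion that assumes equal pairs should be tallied once.

```matlab
A = [203 201
     203 204];            % first column holds the same value twice
limits = [200 208];
lim1 = limits(1) - 1;
s    = limits(2) - lim1;
cols = all((A >= limits(1)) & (A <= limits(2)), 1);

% Variant 1: sort each column, then accumulate
B = sort(A(:, cols), 1, 'descend') - lim1;
R_sort = full(sparse(B(2, :), B(1, :), 1, s, s));

% Variant 2: accumulate unsorted, symmetrize, keep the upper triangle
R0 = full(sparse(A(2, cols) - lim1, A(1, cols) - lim1, 1, s, s));
R_triu = triu(R0 + R0.');

% Assumed fix: remove the extra copy of the diagonal added by R0 + R0.'
R_fixed = R_triu - diag(diag(R0));

disp([R_sort(4, 4), R_triu(4, 4), R_fixed(4, 4)])   % tallies for the pair (203, 203)
```

If the data never contains such columns, the correction is unnecessary and the plain `triu(R + R.')` form is enough.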
See if this is what you were after -
range1 = 200:208 %// Set the range
A = A(:,all(A>=min(range1)) & all(A<=max(range1))) %// select A with columns
%// that fall within range1
A_off = A-range1(1)+1 %// Get the offsetted indices from A
A_off_sort = sort(A_off,1) %// sort offset indices to satisfy "smallest" criteria
out = zeros(numel(range1)); %// storage for output matrix
idx = sub2ind(size(out),A_off_sort(1,:),A_off_sort(2,:)) %// get the indices to be set
unqidx = unique(idx)
out(unqidx) = histc(idx,unqidx) %// set coincidences
With
A = [205 204 201
201 208 205]
this gets -
out =
0 0 0 0 0 0 0 0 0
0 0 0 0 0 2 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
A few performance-oriented tricks could be used here -
I. Replace
out = zeros(numel(range1));
with
out(numel(range1),numel(range1)) = 0;
II. Replace
idx = sub2ind(size(out),A_off_sort(1,:),A_off_sort(2,:))
with
idx = (A_off_sort(2,:)-1)*numel(range1)+A_off_sort(1,:)
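As a quick sanity check of both tricks, the sketch below confirms that the hand-computed linear index agrees with `sub2ind` on the example data, and that the indexing-based preallocation yields a zero matrix of the right size. The names `n`, `idx_sub2ind`, `idx_manual` and `same` are introduced here for illustration. Note that trick I only preallocates when `out` does not exist yet, hence the `clear`.

```matlab
range1 = 200:208;
A = [205 204 201
     201 208 205];
n = numel(range1);

% Offset to 1-based indices and sort each column (smaller value on top)
A_off_sort = sort(A - range1(1) + 1, 1);

% Trick II: column-major linear index, computed two ways
idx_sub2ind = sub2ind([n n], A_off_sort(1, :), A_off_sort(2, :));
idx_manual  = (A_off_sort(2, :) - 1) * n + A_off_sort(1, :);
same = isequal(idx_sub2ind, idx_manual)

% Trick I: assigning the last element creates an n-by-n array of zeros,
% but only if out is not already defined with other contents
clear out
out(n, n) = 0;
size(out)
```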
What about a solution using accumarray? I would first sort each column independently, then use the first row as the first dimension of the final accumulation matrix and the second row as the second dimension. Something like:
limits = 200:208;
A = A(:,all(A>=min(limits)) & all(A<=max(limits))); %// Borrowed from Divakar
%// Sort the columns individually and bring down to 1-indexing
B = sort(A, 1) - limits(1) + 1;
%// Create co-occurrence matrix
C = accumarray(B.', 1, [numel(limits) numel(limits)]);
With:
A = [205 204 201
202 208 202]
This is the output:
C =
0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
With duplicates (borrowed from Luis Mendo):
A = [205 204 201
201 208 205]
Output:
C =
0 0 0 0 0 0 0 0 0
0 0 0 0 0 2 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
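The range does not have to cover every value in A. As a sketch of that case, the snippet below reuses the code above with a narrower `limits` (the range 201:206 is chosen here only for illustration), so the column containing 208 is dropped before tallying.

```matlab
A = [205 204 201
     202 208 202];
limits = 201:206;         % narrower range than the data

% Drop columns with any entry outside the range (here: the 204/208 column)
A = A(:, all(A >= min(limits)) & all(A <= max(limits)));

% Sort each column ascending and shift to 1-based indices
B = sort(A, 1) - limits(1) + 1;          % [2 1; 5 2]

% Each row of B.' is a (row, column) subscript; accumarray sums the 1s
C = accumarray(B.', 1, [numel(limits) numel(limits)]);

[r, c] = find(C);
pairs = [r c] + limits(1) - 1            % [201 202; 202 205]
```

Row and column k of `C` now correspond to the value `limits(k)`, so the matrix is 6-by-6 rather than 9-by-9.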