I have a cell in MATLAB where each element contains a vector of a different length
e.g.
C = {[1 2 3], [2 4 5 6], [1 2 3], [6 4], [7 6 4 3], [4 6], [6 4]}
As you can see, some of the vectors are repeated and others are unique.
I want to count the number of times each vector occurs and return the counts so that I can populate a table in a GUI where each row is a unique combination and the data shows how many times each combination occurs.
e.g.
Count
"[1 2 3]" 2
"[6 4]" 2
"[2 4 5 6]" 1
"[7 6 4 3]" 1
"[4 6]" 1
I should say that the order of the numbers in each vector is important i.e. [6 4] is not the same as [4 6].
Any thoughts how I can do this fairly efficiently?
Thanks to the people who have commented so far. As @Divakar kindly pointed out, I forgot to mention that the values in the vectors can be more than one digit long, e.g. [46, 36 28]. My original code would concatenate the vector [1 2 3 4] into 1234 and then use hist to do the counting. Of course this falls apart when you go above single digits, as you can't tell the difference between [1, 2, 3, 4] and [12, 34].
You can convert all the entries to char and then to a 2D numeric array, and finally use unique(...,'rows') to get labels for the unique rows and use them to get their counts.
C = {[46, 36 28], [2 4 5 6], [46, 36 28], [6 4], [7 6 4 3], [4 6], [6 4]} %// Input
char_array1 = char(C{:})-0; %// char pads the rows to equal length; -0 converts back to numeric
[~,unqlabels,entry_labels] = unique(char_array1,'rows'); %// get unique rows
count = histc(entry_labels,1:max(entry_labels)); %// counts of each unique row
For the purpose of presenting the output in a format as asked in the question, you can use this -
out = [C(unqlabels)' num2cell(count)];
Output -
out =
[1x4 double] [1]
[1x2 double] [1]
[1x2 double] [2]
[1x4 double] [1]
[1x3 double] [2]
and display the unique rows with celldisp -
ans{1} =
2 4 5 6
ans{2} =
4 6
ans{3} =
6 4
ans{4} =
7 6 4 3
ans{5} =
46 36 28
Edit: If you have negative numbers in there, you need to do a little more work to set up char_array1, as shown here. The rest of the code stays the same -
lens = cellfun(@numel,C);
mat1(max(lens),numel(lens))=0;
mat1(bsxfun(@ge,lens,[1:max(lens)]')) = horzcat(C{:});
char_array1 = mat1';
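One caveat with the zero padding above: a vector that genuinely ends in zeros, such as [6 4 0], would produce the same padded row as [6 4]. A possible workaround, which is my own assumption and not part of the original answer, is to append each vector's length as an extra column before calling unique. The input C and the variable name padded below are purely illustrative.

```matlab
% Illustrative input (not from the question): [6 4] and [6 4 0] must stay distinct
C = {[6 4], [6 4 0], [-1 2], [6 4]};

lens = cellfun(@numel, C);                        % length of each vector
mat1 = zeros(max(lens), numel(lens));             % zero-padded, one column per vector
mat1(bsxfun(@ge, lens, (1:max(lens))')) = horzcat(C{:});

% Assumption: appending the length makes padding zeros distinguishable from real zeros
padded = [mat1', lens(:)];

[~, unqlabels, entry_labels] = unique(padded, 'rows');
count = accumarray(entry_labels(:), 1);           % occurrences of each unique row

out = [C(unqlabels)', num2cell(count)];
```

Here C(unqlabels) still returns the original, unpadded vectors, so the display step from the answer should work unchanged.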
A way I can think of is to convert to strings and then use unique
Cs = cellfun(@(x)(mat2str(x)),C,'uniformoutput',false);
[Cu,idx_u,idx] = unique(Cs);
Now you can count the number of occurrences with idx, for instance using
fv=tabulate(idx)
so fv already has all the info you need, but for display purposes I'll add:
[Cu' , num2cell(fv(:,2))]
ans =
'[1 2 3]' [2]
'[2 4 5 6]' [1]
'[4 6]' [1]
'[6 4]' [2]
'[7 6 4 3]' [1]
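The question asks for rows ordered so that the most frequent combinations come first. A minimal sketch of that extra step, assuming the same mat2str labelling as above, could look like the following. It uses accumarray in place of tabulate (which needs the Statistics Toolbox); the names counts, order and tableData are illustrative.

```matlab
C = {[1 2 3], [2 4 5 6], [1 2 3], [6 4], [7 6 4 3], [4 6], [6 4]};

% Turn every vector into a string label such as '[1 2 3]'
Cs = cellfun(@mat2str, C, 'UniformOutput', false);

% Unique labels and, for each entry of C, the index of its label
[Cu, ~, idx] = unique(Cs);
Cu = Cu(:);                                   % force a column for concatenation

% Count occurrences per label, then sort by count, largest first
counts = accumarray(idx(:), 1);
[counts, order] = sort(counts, 'descend');

% N-by-2 cell array: label in column 1, count in column 2
tableData = [Cu(order), num2cell(counts)];
disp(tableData)
```

A cell array of this shape can typically be assigned to the Data property of a uitable, though how it is wired into the GUI depends on the rest of the application.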
Another suggestion I can think of is to convert each array into a single number by concatenating its digits, then compute a histogram to count how many times each number occurs. We first need to find the unique numbers with unique; these serve as the histogram edges.
One thing I will need to note is that we are assuming that each element in your array for each cell is a single digit. This obviously won't work if there are numbers that are two digits or more.
In other words:
%// Convert each array of numbers into a single number
numbers = cellfun(@(x) sum(x.*10.^(numel(x)-1:-1:0)), C);
%// Find unique numbers
uniNumbers = unique(numbers);
%// Get histogram
out = histc(numbers, uniNumbers);
%// Display counts
disp([uniNumbers; out]);
out would contain the counts per unique number in your cell array. We get:
46 64 123 2456 7643
1 2 2 1 1
The trick with the first line of code is that I'm using the decomposition of numbers in base 10, where each number can be uniquely represented as a sum of its digits multiplied by powers of 10. As such, 4587 can be represented as:
4000 + 500 + 80 + 7 ==> 4*10^3 + 5*10^2 + 8*10^1 + 7*10^0
I took each number in our array and used it as the coefficient for each decreasing power of 10, then summed them all together. As such, in your cell array, [1 2 3] is converted to 123, and so on. With your example, this is the output of numbers, which is doing what I talked about above:
numbers =
Columns 1 through 6
123 2456 123 64 7643 46
Column 7
64
Compare this with your actual cell array in C:
celldisp(C)
C{1} =
1 2 3
C{2} =
2 4 5 6
C{3} =
1 2 3
C{4} =
6 4
C{5} =
7 6 4 3
C{6} =
4 6
C{7} =
6 4
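To make the single-digit caveat from earlier in this answer concrete, here is a small sketch that applies the same base-10 formula to vectors containing multi-digit values. The helper name toNumber is hypothetical and only wraps the expression used above.

```matlab
% Same formula as in the answer, wrapped in a hypothetical helper
toNumber = @(x) sum(x .* 10.^(numel(x)-1:-1:0));

a = toNumber([1 5 4]);    % 1*10^2 + 5*10^1 + 4*10^0
b = toNumber([12 34]);    % 12*10^1 + 34*10^0
c = toNumber([15 4]);     % 15*10^1 + 4*10^0

% All three distinct vectors map to the same number, so they would be counted together
disp([a b c])
```

Because different vectors collapse to the same number once any element has two or more digits, this approach is only safe when every element is a single non-negative digit.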