Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Counting and displaying sum of occurrences

Tags:

count

matlab

Part of my data (cell array of strings) is shown below. I want to count the occurrences of particular strings (e.g., 'P0702', 'P0882', etc.) and display the sum of the occurrences in the form of the output shown below:

'1FA'   '2012'  'F' ''  ''  ''  ''  ''  'P0702' 'P0882' 
'1Fc'   '2012'  'r' ''  ''  ''  ''  ''  'P0702' ''  ''  ''  
'1FA'   '2012'  'f' ''  ''  ''  ''  ''  'P0702' 'P0882' ''  
'1FA'   '2012'  'y' ''  ''  ''  'P0702' ''  ''  ''  ''  ''  
'1FA'   '2012'  'g' ''  ''  ''  ''  ''  ''  ''  ''  ''  ''  
'1FA'   '2012'  'u' ''  'P0702' 'P0882' ''  ''  ''  ''  ''  
'1FA'   '2012'  'y' ''  'P0702' ''  ''  ''  ''  ''  ''  ''  
'1FA'   '2012'  'n' ''  'P0702' ''  ''  ''  ''  ''  ''  ''  
'1FA'   '2012'  'j' ''  ''  ''  ''  ''  ''  ''  ''  'P0702'                                
'1FA'   '2012'  'u' 'P0702' ''  ''  ''  ''  ''  ''  ''  ''  
'1FM'   '2013'  'x' ''  ''  ''  ''  ''  'P1921' ''  ''  ''
'1FM'   '2013'  'c' ''  'P1711' ''  ''  ''  ''  ''  ''  ''
'1FM'   '2013'  'c' ''  ''  ''  ''  ''  'P0702' 'P0882' ''
'1FM'   '2009'  'E' ''  ''  ''  ''  ''  ''  ''  'P0500' 

Output:

        sum of counts above      
P0702   15
P0500    1
P1711    1

and so on.

I tried using sum(strcmp(d,{'P0882'}),2); which tells me how many times 'P0882' occurs, but it would be difficult to use it for every data string.

like image 915
calib_san Avatar asked Dec 30 '25 23:12

calib_san


2 Answers

You could do as follows, basically apply strcmp as you proposed but in a loop in which you pre-determined the unique strings/data names to count.

I modified a bit the data you provided so that dimensions fit. The code is commented and pretty easy to follow:

C = {'1FA'   '2012'  'F' ''  ''  ''  ''  ''  'P0702' 'P0882' ;
'1Fc'   '2012'  'r' ''  ''  ''  ''  ''  'P0702' '';
'1FA'   '2012'  'f' ''  ''  ''  ''  ''  'P0702' 'P0882';
'1FA'   '2012'  'y' ''  ''  ''  'P0702' ''  ''  '';
'1FA'   '2012'  'g' ''  ''  ''  ''  ''  ''  '';
'1FA'   '2012'  'u' ''  'P0702' 'P0882' ''  ''  ''  ''  ;
'1FA'   '2012'  'y' ''  'P0702' ''  ''  ''  ''  '' ;
'1FA'   '2012'  'n' ''  'P0702' ''  ''  ''  ''  '' ;
'1FA'   '2012'  'j' ''  ''  ''  ''  ''  ''  'P0702' ;  
'1FA'   '2012'  'u' 'P0702' ''  ''  ''  ''  '' '' ;
'1FM'   '2013'  'x' ''  ''  ''  ''  ''  'P1921' '';
'1FM'   '2013'  'c' ''  'P1711' ''  ''  ''  ''  '';
'1FM'   '2013'  'c' ''  ''  ''  ''  ''  'P0702' 'P0882';
'1FM'   '2009'  'E' ''  ''  ''  ''  ''  '' 'P0500'}

%// Find unique strings to count occurence of.
[strings,~,~] = unique(C(:,4:end));

%// Remove empty cells automatically.
strings = strings(~cellfun(@isempty,strings));

%// Initialize output cell array
Output = cell(numel(strings),2);

%// Count occurence. You can combine the 2 lines into one using concatenation.
for k = 1:numel(strings)

    Output{k,1} = strings{k};    
    Output{k,2} = sum(sum(strcmp(C(:,4:end),strings{k})));

end

Let's make a nice table out of this:

T = table(Output(:,2),'RowNames',Output(:,1),'VariableNames',{'TotalOccurences'})

Output:

T = 

             TotalOccurences
             _______________

    P0500    [ 1]           
    P0702    [10]           
    P0882    [ 4]           
    P1711    [ 1]           
    P1921    [ 1]

If you don't have access to the table function, you can create a cell array with headers and change a bit the loop:

%// Initialize output cell array
Output = cell(numel(strings)+1,2);

%// Count occurence
for k = 1:numel(strings)

    Output{k+1,1} = strings{k};    
    Output{k+1,2} = sum(sum(strcmp(C(:,4:end),strings{k})));

end
%T = table(Output(:,2),'RowNames',Output(:,1),'VariableNames',{'TotalOccurences'})

Output(1,:) = {'Data' 'Occurence'}

Output:

Output = 

    'Data'     'Occurence'
    'P0500'    [        1]
    'P0702'    [       10]
    'P0882'    [        4]
    'P1711'    [        1]
    'P1921'    [        1]
like image 113
Benoit_11 Avatar answered Jan 02 '26 17:01

Benoit_11


If you have the Statistics Toolbox you can simply use tabulate

%// get only relevant part
X = data(:,4:end);

%// tabulate
tabulate(X(:))

It already gives a nicely formatted output:

  Value    Count   Percent
  P0702       10     58.82%
  P1711        1      5.88%
  P0882        4     23.53%
  P1921        1      5.88%
  P0500        1      5.88%

Alternatively with standard functions:

X = data(:,4:end)
[a,~,x] = unique(X(~strcmp(X,'')))
occ = hist(x(:),1:numel(a))
out = [a num2cell(occ).']
like image 25
Robert Seifert Avatar answered Jan 02 '26 17:01

Robert Seifert



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!