
find repetition cell array

Tags: cell, matlab

I have a cell array like

 A = {'hello'; 2; 3; 4; 'hello';2;'hello'}

I would like to find out whether there are repetitions in this array and identify their names and indexes. For this example I would like to get something like:

names = {'hello';2};
indexes = [1, 5, 7;
         2, 6, 0];

I've set the last element of the second row of indexes to 0 only so that the rows have the same length. My problem is that the cell array contains both char and double entries, and I don't know how to deal with this.

gabboshow asked Nov 21 '13 15:11


2 Answers

It's messy, but it can be done:

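m = max(cellfun(@length, A));  % longest entry (string length; 1 for scalar numbers)
A2 = cellfun(@(e) [double(e) inf(1,m-length(e)) ischar(e)], A, 'uni' ,false);  % entry -> numeric row: values, inf padding, is-string flag
A2 = cell2mat(A2);  % stack the rows into one matrix
[~, ~, jj] = unique(A2,'rows');  % jj(i) is the group label of entry i
num = accumarray(jj,1,[],@numel);  % number of entries in each group
[~, kk] = max(bsxfun(@eq, jj, find(num>1).'));  % first position of each repeated group
names = A(kk);  % one representative per repeated value
indices = arrayfun(@(n) find(jj==jj(kk(n))), 1:length(kk), 'uni', false);  % all positions of each repeated value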

How this works: A2 is just A converted to a matrix of numbers. Each row represents one entry of A, with the last column used as a flag to distinguish original numbers from original strings, and inf used as a filler so that all rows have the same length. Then the usual pair of unique and accumarray does the actual job: unique labels identical rows with the same group number in jj, and accumarray counts the entries per group in num. The results are obtained from jj and num with some comparisons and indexing.
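If the zero-padded matrix from the question is wanted instead of a cell array, the index vectors can be copied row by row into a preallocated matrix. This is only a sketch: the values of indices are hard-coded here to what the code above should give for the example A, and the names width and indexes are chosen for illustration.

```matlab
% Example values: positions of 2 and of 'hello' in the question's A
% (assumed output of the code above for that input).
indices = {[2;6], [1;5;7]};

width = max(cellfun(@numel, indices));      % length of the longest index list
indexes = zeros(numel(indices), width);     % zeros act as the padding
for k = 1:numel(indices)
    idx = indices{k};
    indexes(k, 1:numel(idx)) = idx(:).';    % copy as a row, leave the rest 0
end
```

Each row of indexes then corresponds to the entry of names at the same position.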

Luis Mendo answered Oct 03 '22 14:10


Because you are using a structure that contains both strings and numbers, things are not quite as easy. Assuming you cannot change this at all, the best way to find unique values and their indices is to loop through the cell array and save its contents to a map object that stores the indices at which each unique entry occurs.

This is pretty simple with MATLAB's containers.Map, as in the code below.

A = {'hello'; 2; 3; 4; 'hello';2;'hello'}

cellMap = containers.Map();
for i = 1 : numel(A)
    mapKey = num2str(A{i});
    if cellMap.isKey(mapKey)
        tempCell = cellMap(mapKey);
        tempCell{numel(tempCell)+1} = i;
        cellMap(mapKey) = tempCell;
    else
        tempCell = cell(1);
        tempCell{1} = i;
        cellMap(mapKey) = tempCell;
    end
end

With this you can find all unique values by typing cellMap.keys, which will return

ans = 
    '2'    '3'    '4'    'hello'

And then you can use these keys to find out where they occurred in the original array using cellMap('hello').

ans = 
    [1]    [5]    [7]

Once you have all of this, you can do a little bit of conversion to get back to the original state and get things more into a format that you want.

uniqueVals = cellMap.keys;
uniqueIndices = cell(1,numel(uniqueVals));
for i = 1:numel(uniqueVals)
    uniqueIndices{i} = cell2mat(cellMap(uniqueVals{i}));
    numEquiv = str2double(uniqueVals{i});
    if ~isnan(numEquiv)
        uniqueVals{i} = numEquiv;
    end
end
uniqueVals{4}
uniqueIndices{4}

which will return:

ans =
    hello
ans = 
    1     5     7
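One caveat with num2str keys: the number 2 and the string '2' produce the same key, so they would be counted as the same entry. If that matters for your data, one possible workaround is to prefix each key with the class of the entry. The helper keyOf below is a hypothetical name, not part of the code above, and the sample array is made up to show the collision.

```matlab
A = {'2'; 2; 'hello'; 2};                  % made-up sample with a char/double clash

% Hypothetical helper: build a key from the class name plus the text form.
keyOf = @(x) [class(x) ':' num2str(x)];

plainKeys  = cellfun(@num2str, A, 'UniformOutput', false);  % '2' and 2 collide
taggedKeys = cellfun(keyOf,    A, 'UniformOutput', false);  % class prefix keeps them apart
```

Using taggedKeys as the map keys keeps the two kinds of entries separate, at the cost of having to strip the prefix again when converting back.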

Another option, probably much simpler and more straightforward, is to make a copy of your cell array and convert all of its contents to strings. This will not immediately return things in the format you want, but it is a start.

B = cell(size(A));
for i = 1:numel(A)
    B{i} = num2str(A{i});
end
[C,~,IC] = unique(B)

You can then use the outputs of unique to find the indices, but honestly that is all done already by the mapping code I wrote above.
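For completeness, here is a rough sketch of how the outputs of unique could be turned into the repeated names and their indices. It assumes the string conversion described above, so the names come back as strings (2 becomes '2'); the variable names counts, rep, names and indices are chosen for illustration.

```matlab
A = {'hello'; 2; 3; 4; 'hello'; 2; 'hello'};

B = cellfun(@num2str, A, 'UniformOutput', false);  % every entry as a string
[C, ~, IC] = unique(B);                            % IC(i) is the group of entry i in C

counts = accumarray(IC(:), 1);                     % number of occurrences per group
rep = find(counts > 1);                            % groups that occur more than once

names = C(rep);                                    % repeated values, as strings
indices = arrayfun(@(g) find(IC == g).', rep, 'UniformOutput', false);
```

For the example array this should give names = {'2'; 'hello'} with indices {[2 6]; [1 5 7]}.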

MZimmerman6 answered Oct 03 '22 15:10