I have a cell array like
A = {'hello'; 2; 3; 4; 'hello';2;'hello'}
I would like to find out whether there are repeated entries in this array, and identify their names and indexes. In this example I would like to get something like:
names = {'hello';2};
indexes = [1, 5, 7;
2, 6, 0];
I've set the last element of the second row of indexes to 0 only to avoid problems with the dimensions. My problem is that the cell array contains both char and double entries, and I don't know how to deal with this.
It's messy, but it can be done:
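m = max(cellfun(@length, A));              % length of the longest entry
% One numeric row per entry: values or char codes, inf padding, is-char flag
A2 = cellfun(@(e) [double(e) inf(1,m-length(e)) ischar(e)], A, 'uni', false);
A2 = cell2mat(A2);
[~, ~, jj] = unique(A2, 'rows');           % group label for each entry
num = accumarray(jj, 1, [], @numel);       % number of entries per group
% Index of the first occurrence of each group that appears more than once
[~, kk] = max(bsxfun(@eq, jj, find(num>1).'));
names = A(kk);
indices = arrayfun(@(n) find(jj==jj(kk(n))), 1:length(kk), 'uni', false);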
How this works: A2 is just A converted to a matrix of numbers. Each row represents one entry of A, with the last column used as a flag to distinguish original numbers from original strings, and inf used as a filler. Then the usual pair of unique and accumarray does the actual job, and the results are obtained from jj and num with some comparisons and indexing.
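To make the intermediate values concrete, here is a minimal sketch that reruns the steps above on the sample array and then zero-pads the index lists into a single matrix, as the question asked for. The names width and padded are my own and not part of the answer above, and the rows come out in the sorted order produced by unique, so the row for 2 comes before the row for 'hello'.

```matlab
A = {'hello'; 2; 3; 4; 'hello'; 2; 'hello'};

m  = max(cellfun(@length, A));
A2 = cellfun(@(e) [double(e) inf(1, m-length(e)) ischar(e)], A, 'uni', false);
A2 = cell2mat(A2);
% A2(1,:) is [104 101 108 108 111 1]   (char codes of 'hello', flag 1)
% A2(2,:) is [2 Inf Inf Inf Inf 0]     (the number 2, inf padding, flag 0)

[~, ~, jj] = unique(A2, 'rows');       % jj labels each row with its group
num = accumarray(jj, 1);               % num(g) counts the rows in group g
[~, kk] = max(bsxfun(@eq, jj, find(num > 1).'));
indices = arrayfun(@(n) find(jj == jj(kk(n))), 1:length(kk), 'uni', false);

% Zero-pad the index lists into one matrix, one row per repeated value.
% (width and padded are hypothetical names used only in this sketch.)
width  = max(cellfun(@numel, indices));
padded = zeros(numel(indices), width);
for n = 1:numel(indices)
    padded(n, 1:numel(indices{n})) = indices{n}(:).';
end
% padded is [2 6 0; 1 5 7]
```

If you need the rows in order of first appearance instead, one option is to sort kk before building names and indices.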
Because you are using a structure that contains both strings and numbers, things are not quite as easy. Assuming you cannot change this at all, the best way to find the unique values and their indices is to loop through the cell array and save its contents to a map object that stores, for each unique entry, the indices at which it occurs.
This is pretty simple with MATLAB's containers.Map and can follow the code below.
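A = {'hello'; 2; 3; 4; 'hello'; 2; 'hello'}
cellMap = containers.Map();   % char keys, values of any type
for i = 1 : numel(A)
    % Use the text form of each entry as the key.
    % Note: the number 2 and the string '2' end up with the same key.
    mapKey = num2str(A{i});
    if cellMap.isKey(mapKey)
        % Key already seen: append this index to its list
        tempCell = cellMap(mapKey);
        tempCell{numel(tempCell)+1} = i;
        cellMap(mapKey) = tempCell;
    else
        % First occurrence: start a new list of indices
        tempCell = cell(1);
        tempCell{1} = i;
        cellMap(mapKey) = tempCell;
    end
end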
With this you can find all unique values by typing cellMap.keys, which will return
ans =
'2' '3' '4' 'hello'
And then you can use these keys to find out where they occurred in the original array using cellMap('hello'):
ans =
[1] [5] [7]
Once you have all of this, a little bit of conversion turns the keys back into their original types and gets the results into a format closer to what you want.
uniqueVals = cellMap.keys;
uniqueIndices = cell(1,numel(uniqueVals));
for i = 1:numel(uniqueVals)
    % Turn the cell of indices into a plain numeric row vector
    uniqueIndices{i} = cell2mat(cellMap(uniqueVals{i}));
    % Keys that parse as numbers are converted back to double
    numEquiv = str2double(uniqueVals{i});
    if ~isnan(numEquiv)
        uniqueVals{i} = numEquiv;
    end
end
uniqueVals{4}
uniqueIndices{4}
which will return:
ans =
hello
ans =
1 5 7
Another option, and probably a much simpler and more straightforward one, is to make a copy of your cell array and convert all of its contents to strings. This will not immediately return things in the format you want, but it is a start:
B = cell(size(A));
for i = 1:numel(A)
    B{i} = num2str(A{i});
end
[C,~,IC] = unique(B)
You can then use the outputs of unique to find the indices, but honestly that is all done already with the mapping code I wrote above.
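For completeness, here is a rough sketch of how the outputs of unique could be turned into index lists, keeping only the values that occur more than once. The names groups, isRepeated, repeatedNames and repeatedIndices are my own. It assumes, as the num2str conversion already does, that the string '2' and the number 2 count as the same value, and the repeated names come back as strings.

```matlab
A = {'hello'; 2; 3; 4; 'hello'; 2; 'hello'};

% Convert every entry to text so that unique can compare them.
% Assumption: a char '2' and a numeric 2 are treated as the same value.
B = cellfun(@num2str, A, 'UniformOutput', false);

[C, ~, IC] = unique(B);

% Collect the positions in A that share the same unique-value label.
groups = accumarray(IC(:), (1:numel(B)).', [], @(v) {sort(v).'});

% Keep only the values that occur more than once.
isRepeated      = cellfun(@numel, groups) > 1;
repeatedNames   = C(isRepeated);        % {'2'; 'hello'}
repeatedIndices = groups(isRepeated);   % {[2 6]; [1 5 7]}
```

The sort inside the anonymous function is there because accumarray does not promise to pass the values of a group in their original order.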