
MATLAB: comparison of cell arrays of string

I have two cell arrays of strings, and I want to check whether they contain the same strings. They do not have to be in the same order, and we do not know whether they are the same length.

For example:

a = {'2' '4' '1' '3'};
b = {'1' '2' '4' '3'};

or

a = {'2' '4' '1' '3' '5'};
b = {'1' '2' '4' '3'};

First I thought of strcmp, but it would require looping over the contents of one cell array and comparing each against the other. I also considered ismember, using something like:

ismember(a,b) & ismember(b,a)

but we don't know in advance that they are the same length (unequal lengths being the obvious case of a mismatch). So how would you perform this comparison in the most efficient way, without writing too many if/else cases?

Dave asked Jul 12 '10 19:07



1 Answer

You could use the function SETXOR, which will return the values that are not in the intersection of the two cell arrays. If it returns an empty array, then the two cell arrays contain the same values:

arraysAreEqual = isempty(setxor(a,b));
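Applied to the two examples from the question, the check could look like the sketch below. The variable names a1, b1, a2, b2, same1, same2, and extra are only illustrative and do not come from the original post.

```matlab
% Example inputs taken from the question
a1 = {'2' '4' '1' '3'};
b1 = {'1' '2' '4' '3'};
a2 = {'2' '4' '1' '3' '5'};
b2 = {'1' '2' '4' '3'};

% setxor returns the strings that appear in exactly one of the two arrays
same1 = isempty(setxor(a1, b1));   % same strings, different order
same2 = isempty(setxor(a2, b2));   % '5' appears only in a2
extra = setxor(a2, b2);            % the strings not shared by both arrays
```

Note that SETXOR treats its inputs as sets, so repeated strings are ignored: {'1' '1' '2'} and {'1' '2'} would be reported as containing the same values. If duplicate counts matter, a different comparison is needed.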



EDIT: Some performance measures...

Since you were curious about performance measures, I thought I'd test the speed of my solution against the two solutions listed by Amro (which use ISMEMBER and STRCMP/CELLFUN). I first created two large cell arrays:

a = cellstr(num2str((1:10000).'));  % A cell array with 10,000 strings
b = cellstr(num2str((1:10001).'));  % A cell array with 10,001 strings

Next, I ran each solution 100 times over to get a mean execution time. Then, I swapped a and b and reran it. Here are the results:

    Method     |      Time     |  a and b swapped
---------------+---------------+------------------
Using SETXOR   |   0.0549 sec  |    0.0578 sec
Using ISMEMBER |   0.0856 sec  |    0.0426 sec
Using STRCMP   |       too long to bother ;)
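
The exact timing code is not shown in this answer. A minimal sketch of how a mean over 100 runs might be gathered with tic/toc is given below, here for the SETXOR solution only; the names nRuns, t, tStart, result, and meanTime are assumptions made for illustration, and measured times will differ by machine and MATLAB version.

```matlab
a = cellstr(num2str((1:10000).'));  % 10,000 strings
b = cellstr(num2str((1:10001).'));  % 10,001 strings

nRuns = 100;            % number of repetitions (assumed, matching the text)
t = zeros(nRuns, 1);    % per-run elapsed times in seconds

for k = 1:nRuns
    tStart = tic;                       % start a timer for this run
    result = isempty(setxor(a, b));     % the solution being timed
    t(k) = toc(tStart);                 % elapsed time for this run
end

meanTime = mean(t);
fprintf('SETXOR: %.4f sec (mean of %d runs)\n', meanTime, nRuns);
```

Swapping a and b in the timed line gives the figures for the second column.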

Notice that the SETXOR solution is consistently fast in both orderings. The ISMEMBER solution actually runs slightly faster than SETXOR if a has elements that are not in b (0.0426 sec in the swapped case). This is due to the short-circuit && operator, which skips the second half of the calculation (because we already know a and b do not contain the same values). However, if all of the values in a are also in b, both halves must be evaluated and the ISMEMBER solution is significantly slower (0.0856 sec).
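
Amro's answer is not reproduced here, so the following is only a guess at the general shape of an ISMEMBER-based check with short-circuiting, not a copy of that solution. The names sameAB and sameBA are illustrative.

```matlab
a = {'2' '4' '1' '3' '5'};
b = {'1' '2' '4' '3'};

% all(...) reduces each logical vector to a scalar, so && is valid
% even when a and b have different lengths.

% '5' is in a but not in b, so the first test is false and && never
% evaluates the second ismember call.
sameAB = all(ismember(a, b)) && all(ismember(b, a));

% Every element of b is in a, so the first test is true and the second
% ismember call must also run before the result is known.
sameBA = all(ismember(b, a)) && all(ismember(a, b));
```

Both orderings give the same answer; only the amount of work differs, which is consistent with the asymmetry in the table.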

gnovice answered Sep 19 '22 21:09