Faster way to take xor between one row and a matrix of 1's and 0's?

Question

I have a row of data, say A = [0 1 1 1 0 0].

Matrix B contains many rows. For a dummy example let's say it's just B = [1 1 1 0 1 0; 1 0 0 1 0 1].

I want to find the number of columns in which A and a row of B differ, and use that vector of differences to find which row of B is most similar to A. So for the example above, A differs from B(1,:) in columns 1, 4, 5 = 3 total difference. A differs from B(2,:) in columns 1, 2, 3, 6 = 4 total differences, and so I would want to return index 1 to indicate that A is most similar to B(1,:).

In reality B has ~50,000 rows, and A and B both have about 800 columns. My current code to find the most similar row is below:

min(sum(xor(repmat(A,B_rows,1),B),2));

That works, but it's very slow. Any insights into which function is taking so long and ways to improve it?

Robert Seifert · Accepted Answer

There are 6 more or less similar approaches with bsxfun, hard to tell which one is the fastest.

%example data
A = [0 1 1 1 0 0];
B = perms(A);     %720x6 matrix

Counting similarities:

Z = sum(  bsxfun(@eq,  A,B) , 2);
Z = sum( ~bsxfun(@ne,  A,B) , 2);
Z = sum( ~bsxfun(@xor, A,B) , 2);

Counting differences:

Z = sum( ~bsxfun(@eq,  A,B) , 2);
Z = sum(  bsxfun(@ne,  A,B) , 2);
Z = sum(  bsxfun(@xor, A,B) , 2);

Z is a vector containing the number of equal/unequal elements of A for every row of B.

Benchmark for 100 trials each (same order as above):

Shai · Answer

Use pdist2 with 'hamming' distance

[D I] = pdist2( A, B, 'hamming', 'Smallest', 1 );

Faster way to take xor between one row and a matrix of 1's and 0's?

Tags:

arrays

matlab

user1956609

2 Answers

Robert Seifert

Shai

Recent Activity

Donate For Us

Faster way to take xor between one row and a matrix of 1's and 0's?

Tags:

arrays

matlab

user1956609

2 Answers

Robert Seifert

Shai

Related questions

Recent Activity

Donate For Us