Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

spectral clustering

First off I must say that I'm new to matlab (and to this site...) , so please excuse my ignorance.

I'm trying to write a function in matlab that will use Spectral Clustering to split a set of points into two clusters.

my code is as follows

function Groups = TrySpectralClustering(data)
dist_mat = squareform(pdist(data));

W=  zeros(length(data),length(data));

for i=1:length(data),
    for j=(i+1):length(data),
    W(i,j)=10^(-dist_mat(i,j));
    W(j,i)=W(i,j);
    end
end
D = zeros(length(data),length(data));
for i=1:length(W),
D(i,i)=sum(W(i,:));
end
L=D-W;
L=D^(-0.5)*L*D^(-0.5);
[ V E ] = eig(L);
disp ('V:');
disp (V);

If I understand correctly, then by using the second smallest eigenvector I should be able to perform a partition of the data into two clusters - If the ith member of the 2nd eigenvector is positive, the ith data point would be in the one cluster, otherwise it would be in the other cluster.

However, when I try the following

f=[1,1;0,0;1,0;0,1;100,100;100,101;101,101;101,100]
TrySpectralClustering(f)

I would expect that the first four points would form one cluster, and the last four would form another.

However, I receive

V:
   -0.0000   -0.5000    0.0000   -0.5777    0.0000    0.4078   -0.0000    0.5000
   -0.0000   -0.5000    0.0000    0.5777    0.0000   -0.4078   -0.0000    0.5000
   -0.0000   -0.5000    0.0000    0.4078    0.0000    0.5777   -0.0000   -0.5000
   -0.0000   -0.5000    0.0000   -0.4078    0.0000   -0.5777   -0.0000   -0.5000
   -0.5000   -0.0000   -0.0000   -0.0000   -0.7071   -0.0000    0.5000   -0.0000
   -0.5000   -0.0000    0.7071    0.0000   -0.0000   -0.0000   -0.5000   -0.0000
   -0.5000    0.0000   -0.0000    0.0000    0.7071    0.0000    0.5000    0.0000
   -0.5000         0   -0.7071         0         0         0   -0.5000         0

Taking the 2nd eigenvector

  -0.0000   -0.5000    0.0000    0.5777    0.0000   -0.4078   -0.0000    0.5000

I find the one cluster includes the points 1,0;0,1;100,100;101,100 and the other cluster is made from the points 1,1;0,0;100,101;101,101

I wonder what am I doing wrong.

Note: I am working on the above as a part of a homework project.

Thanks in advance!

like image 688
user2182857 Avatar asked Mar 18 '13 15:03

user2182857


2 Answers

What you are getting is correct. Let U be the matrix containing the eigenvectors as shown above and let them be arranged such that the 1st column corresponds to the smallest eigenvalue and progressive columns correspond to the ascending eigenvalues. Then, take a subset of columns of U by retaining the eigenvectors corresponding to the smaller eigenvalues. Now, read these columns row-wise into a new set of vectors, call it Y. Cluster Y to get the spectral clusters. So, let us assume our subset is only the first column. We clearly see that if u were to cluster the first column, u would get the first 4 into 1 cluster and the next 4 into another cluster, which is what you want.

like image 89
Spectre Avatar answered Oct 19 '22 08:10

Spectre


Take a look at the implementation on Prof. J. Shi's webpage. Pay close attention to discretisation.m function.

Moreover, your code is very inefficient. You need to take more advantage of Matlab's vectorization:

W = 10.^( - dist_mat ); % single liner of nested loop for comuting W
% computing the symmetric laplacian
d = sum( W, 2 ); % sum each row
d( d == 0 ) = 1; % avoid division by zero
d_half = 1./sqrt( d );
L = eye( n ) - bsxfun( @times, bsxfun( @times, W, d_half' ),  d_half );
like image 20
Shai Avatar answered Oct 19 '22 10:10

Shai