
Matlab: Kmeans gives different results each time

I am running kmeans in MATLAB on a 400x1000 matrix, and every time I run the algorithm I get different results. Below is a code example:

[idx, ~, ~, ~] = kmeans(factor_matrix, 10, 'dist','sqeuclidean','replicates',20);

Each time I run this code I get a different clustering. Any ideas why?

I am using it to identify multicollinearity issues.

Thanks for the help!

user1129988 asked Aug 27 '12 06:08


1 Answer

The k-means implementation in MATLAB has a randomized component: the selection of the initial centers. This is why the outcome differs between runs. In practice, however, MATLAB runs k-means as many times as the 'replicates' parameter specifies (20 in your call) and returns the clustering with the lowest distortion. If you are still seeing wildly different clusterings each time, it may mean that your data is not amenable to the roughly spherical clusters that k-means looks for, which is a hint to try other clustering algorithms (e.g. spectral ones).
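One way to judge how different two runs really are is to compare their distortion and cross-tabulate the labels. The following is only a rough sketch: the random `factor_matrix` is a hypothetical stand-in for the data in the question, and `crosstab` assumes the Statistics Toolbox (which `kmeans` itself already requires).

```matlab
% Hypothetical stand-in for the question's factor_matrix (400 x 1000).
factor_matrix = randn(400, 1000);
k = 10;

% Two independent runs, each keeping the best of 20 random initializations.
[idx1, ~, sumd1] = kmeans(factor_matrix, k, 'distance', 'sqeuclidean', 'replicates', 20);
[idx2, ~, sumd2] = kmeans(factor_matrix, k, 'distance', 'sqeuclidean', 'replicates', 20);

% Total within-cluster sum of squared distances (the distortion) of each run.
total1 = sum(sumd1);
total2 = sum(sumd2);
fprintf('distortion run 1: %.4f, run 2: %.4f\n', total1, total2);

% Cross-tabulate the two labelings. If every row has a single nonzero entry,
% the runs found the same clusters and only the label numbers differ.
overlap = crosstab(idx1, idx2);
disp(overlap);
```

If the two distortions are close and the cross-tabulation is nearly a permuted diagonal, the runs agree apart from cluster numbering. Widely scattered counts suggest the data has no stable k-means structure.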

You can get deterministic behavior by passing kmeans an initial set of centers as one of the function arguments (the 'start' parameter). This will give you the same output clustering each time. There are several heuristics for choosing the initial set of centers (e.g. k-means++).
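As a minimal sketch of the 'start' approach, assuming random demo data in place of the real `factor_matrix` and, purely for illustration, the first `k` rows as initial centers (any fixed k-by-p matrix would serve):

```matlab
% Hypothetical stand-in for the question's factor_matrix (400 x 1000).
factor_matrix = randn(400, 1000);
k = 10;

% Assumption for illustration: take the first k observations as initial centers.
initial_centers = factor_matrix(1:k, :);

% With explicit starting centers there is no random initialization,
% so repeated calls on the same data should produce the same labels.
idx1 = kmeans(factor_matrix, k, 'distance', 'sqeuclidean', 'start', initial_centers);
idx2 = kmeans(factor_matrix, k, 'distance', 'sqeuclidean', 'start', initial_centers);

same_result = isequal(idx1, idx2);
disp(same_result);
```

Note that 'replicates' is left out here: a single k-by-p start matrix describes one run. As far as I know, combining explicit centers with several replicates requires a k-by-p-by-r array, one page of centers per replicate.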

Ansari answered Oct 15 '22 04:10