
How to use a precomputed distance matrix in Scikit KMeans?

I'm new to scikit.
I can't find an example using a precomputed distance matrix in Scikit KMeans.
Could anybody shed some light on this, preferably with an example?

LeonL. asked Jul 03 '14 18:07


People also ask

Does sklearn K-Means use Euclidean distance?

Sklearn Kmeans uses the Euclidean distance. It has no metric parameter.

What distance metric does sklearn K-Means use?

K-Means in sklearn clusters using Euclidean distance.

Does K-Means use a distance matrix?

K-means, as the name indicates, uses means. Computing the arithmetic mean requires access to the original features, so a distance matrix cannot be used. K-means also does not use pairwise distances between points, only distances from each point to the cluster centers. So a precomputed distance matrix is of no use to this algorithm.
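If all you have is a distance matrix, one common workaround is to switch to a method whose cluster centers are actual data points, such as k-medoids, so that no mean is ever computed. The following is only a minimal pure-Python sketch of that idea, not k-means and not anything scikit-learn provides; the function name `k_medoids`, the naive "first k items" initialisation, and the toy data are all assumptions made for illustration.

```python
def k_medoids(dist, k, max_iter=100):
    """Cluster n items using only an n x n distance matrix (list of lists).

    Hypothetical helper for illustration; returns (labels, medoid_indices).
    """
    n = len(dist)
    medoids = list(range(k))  # assumption: naive deterministic start (first k items)
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each item joins the cluster of its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster ends up empty
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:  # converged
            break
        medoids = new_medoids
    return labels, medoids


# Toy data: six points on a line, turned into a precomputed distance matrix.
values = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in values] for a in values]

labels, medoids = k_medoids(dist, k=2)
print(labels)   # [0, 0, 0, 1, 1, 1]
print(medoids)  # [1, 4]
```

Note that the function never looks at `values`, only at `dist`, which is exactly what k-means cannot do.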

Which function is used to create distance matrix in clustering?

To my knowledge, the R function hclust can build a clustering directly from a distance matrix, such as the one produced by the dist function in R.


1 Answer

Scikit-learn's KMeans does not allow you to pass in a custom (precomputed) distance matrix. It can precompute a Euclidean distance matrix internally to speed up the process, but there's no way to use your own without hacking the source.
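For context, here is a small sketch of what does work in scikit-learn: KMeans is fitted on the feature matrix itself, while some other estimators, DBSCAN being one of them, accept `metric="precomputed"` and take a square distance matrix instead. This is a different algorithm with different behaviour, not a drop-in replacement for k-means, and the toy data and the `eps` / `min_samples` values below are assumptions chosen only to make the example readable.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.metrics import pairwise_distances

# Toy feature matrix: six one-dimensional points forming two obvious groups.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])

# KMeans needs the features themselves; it has no metric parameter.
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# A precomputed distance matrix (Euclidean here, but it could be your own).
D = pairwise_distances(X)

# DBSCAN can consume the matrix directly via metric="precomputed".
# eps and min_samples are illustrative values for this toy data.
db_labels = DBSCAN(eps=1.5, min_samples=2, metric="precomputed").fit_predict(D)
print(db_labels)  # [0 0 0 1 1 1]
```

The KMeans label ids are arbitrary (cluster 0 and 1 may be swapped between runs or versions), so only the grouping should be compared across the two results.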

Artem Sobolev answered Jan 02 '23 12:01