How can I get clusters from distance matrix, using PHP?

I have a distance matrix as a two-dimensional array, like this:

[image: distance matrix]

I need to use it to find clusters of elements. I think I can do this with hierarchical clustering, such as k-means. I found an example here: PHP K-Means

How can I convert my two-dimensional array into an array of points like the one in this example?

$points = [
[80,55],[86,59],[19,85],[41,47],[57,58],
[76,22],[94,60],[13,93],[90,48],[52,54],
[62,46],[88,44],[85,24],[63,14],[51,40],
[75,31],[86,62],[81,95],[47,22],[43,95],
[71,19],[17,65],[69,21],[59,60],[59,12],
[15,22],[49,93],[56,35],[18,20],[39,59],
[50,15],[81,36],[67,62],[32,15],[75,65],
[10,47],[75,18],[13,45],[30,62],[95,79],
[64,11],[92,14],[94,49],[39,13],[60,68],
[62,10],[74,44],[37,42],[97,60],[47,73],
];
Bogdan Lashkov asked Nov 08 '22 13:11


1 Answer

First, a nitpick: k-means is not a hierarchical clustering algorithm. See https://www.quora.com/What-is-the-difference-between-k-means-and-hierarchical-clustering for details on the difference.

Second: you don't want to convert a distance matrix back into the points it originated from, because that would be a step backwards. Hierarchical clustering only needs the pairwise distances, which you already have. Unfortunately, the k-means implementation you linked only has an API that accepts raw coordinates and assumes Euclidean distance, so you have a few options, depending on your requirements:

  1. Where do you get the distance matrix from? If possible, get the raw coordinates (and make sure the distance measure is Euclidean distance) and use the library you linked.

  2. Override the Point class in the library you linked, specifically the getDistanceWith method, so that it returns values from your matrix.

  3. If you only need to compute the clusters once, use Python with sklearn or SciPy. SciPy's hierarchical clustering does exactly what you want, especially: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.cluster.hierarchy.linkage.html

  4. Write your own code: clustering is a fairly approachable topic, so it makes a nice coding exercise.
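
To give an idea of what option 4 could look like, here is a minimal sketch of bottom-up (agglomerative) clustering that works directly on a distance matrix, with no coordinates needed. It is only one possible approach: the function name clusterFromDistanceMatrix is made up for this example, and the choice of single linkage (distance between two clusters = smallest distance between any two of their members) and of stopping at a fixed number of clusters $k are assumptions, not something the question specifies. It also assumes the matrix is square, symmetric and indexed from 0.

```php
<?php

/**
 * Agglomerative clustering on a precomputed distance matrix (illustrative sketch).
 *
 * @param array $dist Square symmetric matrix; $dist[$i][$j] is the distance between items i and j.
 * @param int   $k    Number of clusters to stop at (an assumed stopping rule).
 * @return array      List of clusters, each a list of item indices.
 */
function clusterFromDistanceMatrix(array $dist, int $k): array
{
    $n = count($dist);
    if ($k < 1 || $k > $n) {
        throw new InvalidArgumentException('k must be between 1 and the number of items');
    }

    // Start with every item in its own cluster.
    $clusters = [];
    for ($i = 0; $i < $n; $i++) {
        $clusters[] = [$i];
    }

    while (count($clusters) > $k) {
        $best = INF;
        $bestA = 0;
        $bestB = 1;
        $m = count($clusters);

        // Find the two clusters with the smallest single-linkage distance.
        for ($a = 0; $a < $m; $a++) {
            for ($b = $a + 1; $b < $m; $b++) {
                $d = INF;
                foreach ($clusters[$a] as $i) {
                    foreach ($clusters[$b] as $j) {
                        $d = min($d, $dist[$i][$j]);
                    }
                }
                if ($d < $best) {
                    $best = $d;
                    $bestA = $a;
                    $bestB = $b;
                }
            }
        }

        // Merge cluster B into cluster A, then remove B (this reindexes the list).
        $clusters[$bestA] = array_merge($clusters[$bestA], $clusters[$bestB]);
        array_splice($clusters, $bestB, 1);
    }

    return $clusters;
}

// Toy matrix: items 0 and 1 are close to each other, as are items 2 and 3.
$dist = [
    [0, 1, 9, 8],
    [1, 0, 7, 9],
    [9, 7, 0, 2],
    [8, 9, 2, 0],
];

echo json_encode(clusterFromDistanceMatrix($dist, 2)), PHP_EOL; // prints [[0,1],[2,3]]
```

This naive version rescans all pairs on every merge, so it is only suitable for small matrices. Swapping min for max in the inner loop (and starting $d at -INF) would give complete linkage instead.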

chuck258 answered Nov 15 '22 06:11