Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to find clusters (min x pts. within y distance of cluster center) of geographical points

Given a database of geographical locations (long/lat), what would be the best approach to determining/detecting clusters of locations that are within x miles of the cluster center AND total at least y locations?

e.g. out of 1000 McWidgets in NC, there are 30 clusters each containing 20 or more stores within 7 miles of their respective cluster center.

It's been a long time since my applied math course in college... any help for an old mushy brain would be greatly appreciated.

like image 968
etriad Avatar asked Oct 17 '11 16:10

etriad


People also ask

What are the distance based clustering algorithms?

Distance based methods optimize a global criteria based on the distance between the patterns. k-means, CLARA, CLARANS are examples of dis- tance based clustering method. Density based methods optimize local criteria based on density information of the patterns.

How do you find the center of a cluster of points?

Divide the total by the number of members of the cluster. In the example above, 283 divided by four is 70.75, and 213 divided by four is 53.25, so the centroid of the cluster is (70.75, 53.25).

In which algorithm Each cluster is represented by the Centre of gravity of the cluster?

The k-means nearest-neighbor (k-means) algorithm belongs to the partitioning clustering methods in which the mean value of the objects or observation in a cluster is used as a center of the cluster, which is also regarded as the center of gravity for a cluster.

How does the DBSCAN algorithm find the clusters?

The algorithm proceeds by arbitrarily picking up a point in the dataset (until all points have been visited). If there are at least 'minPoint' points within a radius of 'ε' to the point then we consider all these points to be part of the same cluster.


2 Answers

A common method for this kind of problem is Density-based Spatial Clustering of Applications with Noise (DBSCAN). A variation which may be a better choice, if you can't determine a good density parameter, is the Ordering Points To Identify the Clustering Structure (OPTICS) algorithm, which uses a distance parameter, rather than a density parameter.

like image 104
andand Avatar answered Oct 04 '22 17:10

andand


You probably need one of the Clustering algorithms.

like image 31
Tomas Avatar answered Oct 04 '22 17:10

Tomas