Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cluster one-dimensional data optimally? [closed]

Does anyone have a paper that explains how the Ckmeans.1d.dp algorithm works?

Or: what is the most optimal way to do k-means clustering in one-dimension?

like image 988
Laciel Avatar asked Oct 23 '11 22:10

Laciel


People also ask

Why is it difficult to cluster data in higher dimensional spaces?

On the one hand, it is notoriously difficult to define a distance between data points in high-dimensional scRNAseq space due to the Curse of Dimensionality; one the other hand, clustering algorithms often use idealistic assumptions which do not hold for the real world data.

What are the limitations of cluster analysis?

Limitations of Cluster AnalysisThe different methods of clustering usually give very different results. This occurs because of the different criterion for merging clusters (including cases). It is important to think carefully about which method is best for what you are interested in looking at.

Can you cluster high-dimensional data?

A cluster is defined and characterized based on the attributes present in the cluster. Clustering High-Dimensional Data we need to search for clusters and find out the space for the existing clusters. The High-Dimensional data is reduced to low-dimension data to make the clustering and search for clusters simple.


1 Answers

Univariate k-means clustering can be solved in O(kn) time (on already sorted input) based on theoretical results on Monge matrices, but the approach was not popular most likely due to numerical instability and also perhaps coding challenges.

A better option is an O(knlgn) method that is now implemented in Ckmeans.1d.dp version 3.4.6. This implementation is as fast as heuristic k-means but offers guaranteed optimality, orders of magnitude better than heuristic k-means especially for large k's.

The generic dynamic programming solution by Richard Bellman (1973) does not touch upon specifics of the k-means problem and the implied runtime is O(kn^3).

like image 170
user6417312 Avatar answered Oct 04 '22 12:10

user6417312