I have an algorithm that runs on a set of objects. It produces a score value that indicates the differences between the elements in the set. The sorted output is something like this:

[1,1,5,6,1,5,10,22,23,23,50,51,51,52,100,112,130,500,512,600,12000,12230]

If you lay these values out in a spreadsheet, you can see that they form groups:

[1,1,5,6,1,5] [10,22,23,23] [50,51,51,52] [100,112,130] [500,512,600] [12000,12230]

Is there a way to programmatically get those groupings? Maybe some clustering algorithm from a machine learning library? Or am I overthinking this?

I've looked at scikit, but their examples are way too advanced for my problem...

<h3>Don't use clustering for 1-dimensional data</h3>
Clustering algorithms are designed for multivariate data. When you have 1-dimensional data, sort it and look for the largest gaps. This is trivial and fast in 1D, and not possible in 2D, where the data has no natural sort order. If you want something more advanced, use Kernel Density Estimation (KDE) and look for local minima of the density to split the data set.
There are a number of duplicates of this question:
<ul> <li>1D Number Array Clustering</li> <li>Cluster one-dimensional data optimally?</li> </ul>
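As a rough illustration of the sort-and-split idea above, here is a minimal Python sketch. The function name `split_at_largest_gaps` and the `n_groups` parameter are made up for this example and are not from the answer. It also assumes you already know how many groups you want, which the answer does not specify.

```python
def split_at_largest_gaps(values, n_groups):
    """Sort the values and cut at the (n_groups - 1) widest gaps.

    Hypothetical helper for illustration; assumes the number of groups
    is known in advance.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    data = sorted(values)
    if not data:
        return []

    # Gap i is the distance between data[i] and data[i + 1].
    gaps = [(data[i + 1] - data[i], i) for i in range(len(data) - 1)]

    # Indices of the widest gaps, put back into left-to-right order.
    widest = sorted(gaps, reverse=True)[:n_groups - 1]
    cut_after = sorted(i for _, i in widest)

    # Slice the sorted data at each chosen gap.
    groups = []
    start = 0
    for i in cut_after:
        groups.append(data[start:i + 1])
        start = i + 1
    groups.append(data[start:])
    return groups


if __name__ == "__main__":
    print(split_at_largest_gaps([1, 2, 3, 10, 11, 12, 50, 51], 3))
    # → [[1, 2, 3], [10, 11, 12], [50, 51]]
```

On data that spans several orders of magnitude, like the scores in the question, raw differences are dominated by the largest values, so this sketch will not necessarily reproduce the hand-picked groups. Comparing gaps on log-transformed values may work better there, but that is a suggestion of this example and not part of the answer.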

Clustering values by their proximity in python (machine learning?) [duplicate]

answered Sep 24 '22 03:09

Has QUIT--Anony-Mousse

PCoelho
