
sort list of floating-point numbers in groups

Tags:

python

numpy

I have an array of floating-point numbers, which is unordered. I know that the values always fall around a few points, which are not known. For illustration, this list

[10.01,5.001,4.89,5.1,9.9,10.1,5.05,4.99]

has values clustered around 5 and 10, so I would like [5,10] as the answer.

I would like to find those clusters for lists with 1000+ values, where the number of clusters is probably around 10 (for some given tolerance). How can I do that efficiently?

eudoxos asked Nov 22 '11 12:11



1 Answer

Check out python-cluster. With this library you could do something like this:

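from cluster import *

data = [10.01, 5.001, 4.89, 5.1, 9.9, 10.1, 5.05, 4.99]
# Distance between two values is their absolute difference
cl = HierarchicalClustering(data, lambda x, y: abs(x - y))
# Cut the hierarchy at distance 1.0 and average each resulting cluster
print([mean(cluster) for cluster in cl.getlevel(1.0)])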

And you would get:

[5.0062, 10.003333333333332]

(This is a very silly example, because I don't really know what you want to do, and because this is the first time I've used this library)
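
Since the data is one-dimensional, a plain NumPy sketch might also be enough: sort the values, start a new group wherever the gap between neighbours exceeds the tolerance, and average each group. The names `cluster_means` and `tol` below are made up for illustration and are not part of python-cluster; the sketch assumes clusters are separated by gaps larger than `tol`.

```python
import numpy as np

def cluster_means(values, tol=1.0):
    """Split sorted 1-D values wherever neighbours differ by more than tol,
    then return the mean of each group (hypothetical helper)."""
    arr = np.sort(np.asarray(values, dtype=float))
    if arr.size == 0:
        return []
    # Positions where a new group starts: the gap to the previous value exceeds tol
    breaks = np.flatnonzero(np.diff(arr) > tol) + 1
    return [float(group.mean()) for group in np.split(arr, breaks)]

data = [10.01, 5.001, 4.89, 5.1, 9.9, 10.1, 5.05, 4.99]
print(cluster_means(data, tol=1.0))
```

For the sample list this should give two means, roughly 5.006 and 10.003. The sort dominates the cost, so it should scale comfortably to lists with thousands of values. Picking `tol` is still up to you: too small and one cluster splits apart, too large and neighbouring clusters merge.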

Fábio Diniz answered Sep 17 '22 14:09
