Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Implementation of OPTICS (Clustering) Algorithm

I'm looking for a decent implementation of the OPTICS algorithm in Python. I will use it to form density-based clusters of points ((x,y) pairs).

I'm looking for something that takes in (x,y) pairs and outputs a list of clusters, where each cluster in the list contains a list of (x, y) pairs belonging to that cluster.

like image 821
Murat Derya Özen Avatar asked Apr 01 '11 15:04

Murat Derya Özen


2 Answers

I'm not aware of a complete and exact python implementation of OPTICS. The links posted here seem just rough approximations of the OPTICS idea. They also do not use an index for acceleration, so they will run in O(n^2) or more likely even O(n^3).

OPTICS has a number of tricky things besides the obvious idea. In particular, the thresholding is proposed to be done with relative thresholds ("xi") instead of absolute thresholds as posted here (at which point the result will be approximately that of DBSCAN!).

The original OPTICS paper contains a suggested approach to converting the algorithm's output into actual clusters:

http://www.dbs.informatik.uni-muenchen.de/Publikationen/Papers/OPTICS.pdf

The OPTICS implementation in Weka is essentially unmaintained and just as incomplete. It doesn't actually produce clusters, it only computes the cluster order. For this it makes a duplicate of the database - it isn't really Weka code.

There seems to be a rather extensive implementation available in ELKI in Java by the group that published OPTICS in the first place. You might want to test any other implementation against this "official" version.

like image 178
Has QUIT--Anony-Mousse Avatar answered Sep 19 '22 22:09

Has QUIT--Anony-Mousse


EDIT: the following is known to not be a complete implementation of OPTICS.

I did a quick search and found the following (Optics). I can't vouch for its quality, however the algorithm seems pretty simple, so you should be able to validate/adapt it quickly.

Here is a quick example of how to build clusters on the output of the optics algorithm:

def cluster(order, distance, points, threshold):     ''' Given the output of the options algorithm,     compute the clusters:      @param order The order of the points     @param distance The relative distances of the points     @param points The actual points     @param threshold The threshold value to cluster on     @returns A list of cluster groups     '''     clusters = [[]]     points   = sorted(zip(order, distance, points))     splits   = ((v > threshold, p) for i,v,p in points)     for iscluster, point in splits:          if iscluster: clusters[-1].append(point)         elif len(clusters[-1]) > 0: clusters.append([])     return clusters      rd, cd, order = optics(points, 4)     print cluster(order, rd, points, 38.0) 
like image 21
Bashwork Avatar answered Sep 22 '22 22:09

Bashwork