
Method for calculating irregularly spaced accumulation points

I am attempting to do the opposite of this: Given a 2D image of (continuous) intensities, generate a set of irregularly spaced accumulation points, i.e., points that cover the 2D map irregularly, lying closer to each other in areas of high intensity (but without overlapping!).

My first try was "weighted" k-means. As I didn't find a working implementation of weighted k-means, I introduce the weights by repeating the points with high intensities. Here is my code:

import numpy as np
from sklearn.cluster import KMeans

def accumulation_points_finder(x, y, data, n_points, method, cut_value):
    #computing the rms (estimate_rms is a helper defined elsewhere, not shown here)
    rms = estimate_rms(data)
    #structuring the data: coordinate grids with the same shape as data
    X, Y = np.meshgrid(x, y, sparse=False)
    if cut_value > 0.:
        mask = data > cut_value
        #applying the mask (boolean indexing also flattens the arrays)
        X = X[mask]; Y = Y[mask]; data = data[mask]
    else:
        X = X.ravel(); Y = Y.ravel(); data = data.ravel()

    if method == 'weighted_kmeans':
        #emulating weights: each pixel is repeated ceil(intensity/rms) times
        res = []
        for i in range(len(data)):
            w = int(np.ceil(data[i]/rms))
            res.extend([[X[i], Y[i]]]*w)
        res = np.asarray(res)
        #kmeans object instantiation
        #(n_jobs is omitted: it was removed from KMeans in scikit-learn 1.0)
        kmeans = KMeans(init='k-means++', n_clusters=n_points, n_init=25)
        #performing kmeans clustering
        kmeans.fit(res)
        #returning just (x,y) positions
        return kmeans.cluster_centers_
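
For reference, recent scikit-learn versions accept per-sample weights through the `sample_weight` argument of `KMeans.fit`, which should give roughly the same effect as repeating points without building the enlarged array. The following is only a minimal sketch of that variant, not the code used for the results in this question: the helper name `weighted_kmeans_points`, the clipping of negative intensities to zero, and the synthetic Gaussian test image are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def weighted_kmeans_points(x, y, data, n_points, seed=0):
    """Hypothetical helper: k-means centers weighted by pixel intensity."""
    X, Y = np.meshgrid(x, y)
    coords = np.column_stack([X.ravel(), Y.ravel()])
    # Assumption: negative intensities should carry no weight.
    weights = np.clip(data.ravel(), 0.0, None)
    kmeans = KMeans(n_clusters=n_points, init='k-means++',
                    n_init=10, random_state=seed)
    # sample_weight replaces the explicit repetition of high-intensity pixels.
    kmeans.fit(coords, sample_weight=weights)
    return kmeans.cluster_centers_

# Synthetic test image: a single Gaussian blob (illustrative data only).
x = np.linspace(0.0, 1.0, 40)
y = np.linspace(0.0, 1.0, 40)
XX, YY = np.meshgrid(x, y)
image = np.exp(-((XX - 0.3)**2 + (YY - 0.7)**2) / 0.02)
centers = weighted_kmeans_points(x, y, image, n_points=8)
```

The two results shown below were produced with the repetition-based code above, not with this variant.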

Here are two different results: 1) Making use of all the data pixels. 2) Making use of only pixels above some threshold (RMS).

Without threshold

With threshold

As you can see, the points seem to be more regularly spaced than concentrated in areas of high intensity.

So my question is whether there exists a better (deterministic, if possible) method for computing such accumulation points.

mavillan asked Aug 09 '16 02:08



1 Answer

Partition the data using a quadtree (https://en.wikipedia.org/wiki/Quadtree) into units of roughly equal variance, subdividing any unit whose variance exceeds a defined threshold (it may also be possible to use the concentration value itself as the criterion), then keep one point per unit (the centroid). There will be more subdivisions in areas with rapidly changing values and fewer in the background areas.
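
A minimal sketch of this idea, under some assumptions that are not part of the description above: the function name `quadtree_points`, the `min_size` stopping rule, the use of each block's geometric center as its "centroid" (an intensity-weighted centroid would be another option), and the synthetic test image are all illustrative choices.

```python
import numpy as np

def quadtree_points(data, var_threshold, min_size=2):
    """Recursively split the image into quadrants until the variance of
    each block is at most var_threshold (or the block is too small),
    then return one (row, col) point per leaf block."""
    points = []

    def split(r0, r1, c0, c1):
        block = data[r0:r1, c0:c1]
        h, w = r1 - r0, c1 - c0
        # Stop when the block is homogeneous enough or too small to split.
        if block.var() <= var_threshold or h <= min_size or w <= min_size:
            # Assumption: use the geometric center of the block.
            points.append(((r0 + r1 - 1) / 2.0, (c0 + c1 - 1) / 2.0))
            return
        rm, cm = r0 + h // 2, c0 + w // 2
        split(r0, rm, c0, cm)
        split(r0, rm, cm, c1)
        split(rm, r1, c0, cm)
        split(rm, r1, cm, c1)

    split(0, data.shape[0], 0, data.shape[1])
    return np.array(points)

# Synthetic 64x64 image: one Gaussian blob centered in the top-left quadrant.
rows, cols = np.mgrid[0:64, 0:64]
image = np.exp(-((rows - 16)**2 + (cols - 16)**2) / (2 * 5.0**2))
pts = quadtree_points(image, var_threshold=1e-3)

# Count the points that fall in the quadrant containing the blob.
in_blob = int(np.sum((pts[:, 0] < 32) & (pts[:, 1] < 32)))
```

The recursion is deterministic, and `var_threshold` controls the number of points only indirectly, so obtaining an exact count would require tuning the threshold.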

Benjamin answered Sep 30 '22 14:09
