Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sample given points stochastically in a 3D space with minimum nearest-neighbor distance and maximum density

I have n points in a 3D space. I want to stochastically sample a subset of points with all nearest-neighbor distances larger than r. The size of the subset m is unknown, but I want the sampled points to be as dense as possible, i.e. maximize m.

There are similar questions, but they are all about generating points, rather than sampling from given points.
Generate random points in 3D space with minimum nearest-neighbor distance

Generate 3-d random points with minimum distance between each of them?

Say I have 300 random 3D points,

import numpy as np
n = 300
points = np.random.uniform(0, 10, size=(n, 3))

What is the fastest way to get a subset of m points with minimum nearest-neighbor distance r = 1 while maximizing m?

like image 989
Shaun Han Avatar asked Oct 26 '22 15:10

Shaun Han


2 Answers

There's probably an efficient bicriteria approximation scheme, but why bother when integer programming is so quick on average?

import numpy as np

n = 300
points = np.random.uniform(0, 10, size=(n, 3))

from ortools.linear_solver import pywraplp

solver = pywraplp.Solver.CreateSolver("SCIP")
variables = [solver.BoolVar("x[{}]".format(i)) for i in range(n)]
solver.Maximize(sum(variables))
for j, q in enumerate(points):
    for i, p in enumerate(points[:j]):
        if np.linalg.norm(p - q) <= 1:
            solver.Add(variables[i] + variables[j] <= 1)
solver.EnableOutput()
solver.Solve()
print(len([i for (i, variable) in enumerate(variables) if variable.SolutionValue()]))
like image 154
David Eisenstat Avatar answered Oct 29 '22 22:10

David Eisenstat


This isn't optimally large of a subset, but should be close while not taking very long, using KDTree to optimize the distance calculations:

from scipy.spatial import KDTree
import numpy as np

def space_sample(n = 300, low = 0, high = 10, dist = 1):
    points = np.random.uniform(low, high, size=(n, 3))
    k = KDTree(points)
    pairs = np.array(list(k.query_pairs(dist)))
    
    def reduce_pairs(pairs, remove = []):  #iteratively remove the most connected node
        p = pairs[~np.isin(pairs, remove).any(1)]
        if p.size >0:
            count = np.bincount(p.flatten(), minlength = n)
            r = remove + [count.argmax()]
            return reduce_pairs(p, r)
        else:
            return remove
    
    return np.array([p for i, p in enumerate(points) if not(i in reduce_pairs(pairs))])

subset = space_sample()

Iteratively removing the most connected node is not optimal (keeps about 200 of the 300 points), but is likely the fastest algorithm that's close to optimal (the only thing faster being just removing at random). You could possibly @njit reduce_pairs to make that part faster (will try if I have time later).

like image 35
Daniel F Avatar answered Oct 30 '22 00:10

Daniel F