I have a list of (x, y) points. I'm trying to produce an image where each pixel's value is the distance to the nearest of those points, so my naive function looks like:
from scipy.spatial.distance import cdist
import numpy as np

def findDistances(imageSize, points):
    image = np.zeros(imageSize)
    for x in range(imageSize[1]):
        for y in range(imageSize[0]):
            image[y, x] = np.min(cdist(np.array([[x, y]]), points))
    return image
This function works and gives what I want. It takes about 100 s for a ~1 MP image, which is fine for something that only ever needs to be done once, but I assume there's room for improvement. I also tried:
image[y, x] = np.min(np.linalg.norm(np.array([[x, y]]) - points, axis=1))
This runs in a comparable time, which makes sense; they're probably doing something similar under the hood, though I haven't checked the source to be sure.
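I suspect most of that 100 s is per-call Python overhead rather than the distance arithmetic itself, so one cheap variant would be to feed cdist a whole row of pixel coordinates at a time instead of a single point. A rough sketch of that idea (findDistancesRows is just an illustrative name, not timed here):

from scipy.spatial.distance import cdist
import numpy as np

def findDistancesRows(imageSize, points):
    image = np.zeros(imageSize)
    cols = np.arange(imageSize[1], dtype=float)
    for y in range(imageSize[0]):
        # One cdist call per row of pixels: (numCols, 2) coords vs (numPoints, 2) points
        rowCoords = np.column_stack((cols, np.full_like(cols, y)))
        image[y, :] = cdist(rowCoords, points).min(axis=1)
    return image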
I had a look at Scipy's cKDTree, with:
from scipy.spatial import cKDTree

def findDistancesTree(imageSize, points):
    tree = cKDTree(points)
    image = np.zeros(imageSize)
    for x in range(imageSize[1]):
        for y in range(imageSize[0]):
            image[y, x] = tree.query([x, y])[0]
    return image
A single call to tree.query takes around 50 µs (from %timeit), and in practice it takes 70-80 s to generate a 1 MP distance map. A 20 s improvement is better than a kick in the groin, but I don't know if I can improve it further.
%timeit np.min(np.linalg.norm(np.array([[np.random.random()*1000, np.random.random()*1000]]) - points, axis=1))
%timeit np.min(cdist(np.array([[np.random.random()*1000, np.random.random()*1000]]), points))
%timeit tree.query(np.array([[np.random.random()*1000, np.random.random()*1000]]))
10000 loops, best of 3: 82.8 µs per loop
10000 loops, best of 3: 67.9 µs per loop
10000 loops, best of 3: 52.3 µs per loop
In terms of complexity, brute force should be something like O(NM) where N is the number of pixels in the image and M is the number of points to check against. I was expecting more of a speedup as the search time should be more like O(N log(M)) for N pixels with a log(M) look-up time for each one - am I missing something?
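One thing I'm aware of but haven't timed properly: tree.query accepts a whole array of query points, so the per-call Python overhead could be taken out of the picture by handing every pixel coordinate to the tree in a single batched query. A minimal sketch (findDistancesTreeBatched is just an illustrative name):

from scipy.spatial import cKDTree
import numpy as np

def findDistancesTreeBatched(imageSize, points):
    tree = cKDTree(points)
    # Build an (N, 2) array of (x, y) coordinates, one per pixel
    ys, xs = np.indices(imageSize)
    pixels = np.column_stack((xs.ravel(), ys.ravel()))
    # A single vectorised query keeps the loop in C instead of Python
    dists, _ = tree.query(pixels)
    return dists.reshape(imageSize)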

This sounded like a problem where even a basic brute-force GPU implementation would give a good improvement, so I gave it a shot, and the improvement was quite good.
I did my testing using pyopencl.
import pyopencl as cl
import numpy as np

def findDistances_cl(imageSize, points):
    # Boilerplate OpenCL setup
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    with open('nn.cl', 'r') as f:
        fstr = f.read()
    program = cl.Program(ctx, fstr).build()
    # Create buffers for the OpenCL kernel
    mf = cl.mem_flags
    img = np.empty(imageSize, dtype=np.float32)
    # COPY_HOST_PTR so the temporary astype() arrays can be released safely
    x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=points[:, 0].astype(np.float32))
    y_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=points[:, 1].astype(np.float32))
    n_points = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=np.array([len(points)], dtype=np.int32))
    img_buf = cl.Buffer(ctx, mf.WRITE_ONLY, img.nbytes)
    # Run the kernel: one work-item per pixel
    exec_evt = program.nn(queue, img.shape, None, img_buf, x_buf, y_buf, n_points)
    exec_evt.wait()
    # Read back the result
    cl.enqueue_copy(queue, img, img_buf).wait()
    return img
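(Note: cl.create_some_context() may prompt interactively for which OpenCL device to use; setting the PYOPENCL_CTX environment variable picks one up front.)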
The OpenCL kernel (nn.cl):
__kernel void nn(__global float *output, __constant float *x, __constant float *y, __constant int *numPoints)
{
    // One work-item per pixel: dimension 0 is the row, dimension 1 the column
    int row = get_global_id(0);
    int col = get_global_id(1);
    int numRows = get_global_size(0);
    int numCols = get_global_size(1);
    int gid = col + row * numCols;

    // Initialise with a value larger than any distance within the image
    float minDist = numRows * numCols;
    for (int i = 0; i < *numPoints; i++) {
        minDist = min(minDist, sqrt((row - y[i]) * (row - y[i]) + (col - x[i]) * (col - x[i])));
    }
    output[gid] = minDist;
}
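Each work-item handles a single pixel and loops over every point, so the total work is still O(NM); the win comes from running all the pixels in parallel with no per-pixel Python overhead.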
Timing results.
imageSize = [1000, 1000]
points = np.random.random((1000,2))*imageSize[0]
In [4]: %timeit findDistancesTree(imageSize, points)
1 loops, best of 3: 27.1 s per loop
In [7]: %timeit findDistances_cl(imageSize, points)
10 loops, best of 3: 55.3 ms per loop
That's about a 490x speed improvement over the cKDTree loop. If you need more speed, there are more advanced algorithms out there for doing nearest-neighbor search on GPUs.