Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Vectorize mask of squared euclidean distance in Python

I'm running code to generate a mask of locations in B closer than some distance D to locations in A.

N = [[0 for j in range(length_B)] for i in range(length_A)]    
dSquared = D*D

for i in range(length_A):
    for j in range(length_B):
        if ((A[j][0]-B[i][0])**2 + (A[j][1]-B[i][1])**2) <= dSquared:
            N[i][j] = 1

For lists of A and B that are tens of thousands of locations long, this code takes a while. I'm pretty sure there's a way to vectorize this though to make it run much faster. Thank you.

like image 577
ToneDaBass Avatar asked Apr 14 '16 16:04

ToneDaBass


1 Answers

It's easier to visualize this code with 2d array indexing:

for j in range(length_A):
    for i in range(length_B):
        dist = (A[j,0] - B[i,0])**2 + (A[j,1] - B[i,1])**2 
        if dist <= dSquared:
            N[i, j] = 1

That dist expression looks like

((A[j,:] - B[i,:])**2).sum(axis=1)

With 2 elements this array expression might not be faster, but it should help us rethink the problem.

We can perform the i,j, outter problems with broadcasting

A[:,None,:] - B[None,:,:]  # 3d difference array

dist=((A[:,None,:] - B[None,:,:])**2).sum(axis=-1)  # (lengthA,lengthB) array

Compare this to dSquared, and use the resulting boolean array as a mask for setting elements of N to 1:

N = np.zeros((lengthA,lengthB))
N[dist <= dSquared] = 1

I haven't tested this code, so there may be bugs, but I think basic idea is there. And may be enough of the thought process to let you work out the details for other cases.

like image 116
hpaulj Avatar answered Sep 23 '22 10:09

hpaulj