Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pairwise Distances Between Two "islands"/"connected components" in Numpy Array

Consider the following image, stored as a numpy array:

a = [[0,0,0,0,0,1,1,0,0,0],
     [0,0,0,0,1,1,1,1,0,0],
     [0,0,0,0,0,1,1,0,0,0],
     [0,0,0,0,0,0,0,0,0,0],
     [0,0,0,0,0,2,0,0,0,0],
     [0,0,0,0,0,2,2,0,0,0],
     [0,0,0,0,0,2,0,0,0,0],
     [0,0,0,0,3,3,3,0,0,0],
     [4,0,0,0,0,0,0,0,0,0],
     [4,4,0,0,0,0,0,0,0,0],
     [4,4,4,0,0,0,0,0,0,0]]

a = np.array(a)

Zeros represent background pixels, 1,2,3 and 4 represent pixels that belong to objects. You can see that objects always form contiguous islands or regions in the image. I would like to know the distance between every pair of objects. As distance measure I'd like to have the shortest, staightline distance, between those pixels of the object, that are closest to each other. Example: Distance(2,3) = 1, because they are touching. Distance(1,2) = 2, because there is exactly one background pixel separating the two regions, or in other words, the closest pixels of the objects are two pixels apart.

Can anybody tell me how one would approach this problem in Python? Or link me to some resources?

like image 671
r0f1 Avatar asked Jul 09 '20 08:07

r0f1


2 Answers

For many blobs or bigger blobs or if performance/memory efficiency is a criteria, you might want to work with contours of those islands. With that in mind, we will use OpenCV's findContours to get the contours, then perform pairwise distance computation and get the min one as the final output. The implementation would look something like this that gets all possible pairiwise distances -

from scipy.spatial.distance import cdist
import cv2

ids = np.arange(1, a.max()+1) #np.unique(a)[1:] if not in ranged sequence

idxs = []
for id_ in ids:
    im = (a == id_).astype(np.uint8)
    contours,_ = cv2.findContours(im, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    idx = contours[0][:, 0]
    idxs.append(idx)

# Get pairwise indices and then distances
r,c = np.triu_indices(len(ids), 1)
pdists = {(ids[i],ids[j]):cdist(idxs[i], idxs[j]).min() for (i, j) in zip(r, c)}

Output dict for given sample -

In [225]: pdists
Out[225]: 
{(1, 2): 2.0,
 (1, 3): 5.0,
 (1, 4): 7.810249675906654,
 (2, 3): 1.0,
 (2, 4): 5.0,
 (3, 4): 3.605551275463989}

By default,cdist uses euclidean distance as the metric. Depending on your definition of straight line between islands, you might want to try out other metrics, namely 'minkowski' and 'cityblock' for Minkowski and Manhattan distances respectively.

So, cdist(idxs[i], idxs[j]) would change to cdist(idxs[i], idxs[j], metric=...).

like image 178
Divakar Avatar answered Oct 07 '22 16:10

Divakar


This is what you would need:

from scipy.spatial.distance import cdist
def Distance(a, m, n):
  return cdist(np.argwhere(a==m),np.argwhere(a==n),'minkowski',p=1.).min()

or similarly per @MaxPowers comment (claim: cityblock is faster):

  return cdist(np.argwhere(a==m),np.argwhere(a==n),'cityblock').min()

Find the locations of islands and calculate pairwise distance of locations and get the minimum. I am not 100% sure of your desired distance, but I think you are looking for l1 norm. If not, you can change the cdist measure to your desired metric.

output:

Distance(a,2,3)
1.0
Distance(a,2,1)
2.0
Distance(a,3,1)
5.0
Distance(a,4,3)
5.0
like image 39
Ehsan Avatar answered Oct 07 '22 17:10

Ehsan