Calculating distances between unique Python array regions?

I have a raster with a set of unique ID patches/regions which I've converted into a two-dimensional Python numpy array. I would like to calculate pairwise Euclidean distances between all regions to obtain the minimum distance separating the nearest edges of each raster patch. As the array was originally a raster, a solution needs to account for diagonal distances across cells (I can always convert any distances measured in cells back to metres by multiplying by the raster resolution).

I've experimented with the cdist function from scipy.spatial.distance as suggested in this answer to a related question, but so far I've been unable to solve my problem using the available documentation. As an end result I would ideally have a 3 by X array in the form of "from ID, to ID, distance", including distances between all possible combinations of regions.

Here's a sample dataset resembling my input data:

import numpy as np
import matplotlib.pyplot as plt

# Sample study area array
example_array = np.array([[0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0],
                          [0, 0, 2, 0, 2, 2, 0, 6, 0, 3, 3, 3],
                          [0, 0, 0, 0, 2, 2, 0, 0, 0, 3, 3, 3],
                          [0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 3, 0],
                          [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3],
                          [1, 1, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3],
                          [1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 3],
                          [1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 0],
                          [1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 0],
                          [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                          [1, 0, 1, 0, 0, 0, 0, 5, 5, 0, 0, 0],
                          [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4]])

# Plot array (the "spectral" colormap was removed from Matplotlib; "nipy_spectral" replaces it)
plt.imshow(example_array, cmap="nipy_spectral", interpolation='nearest')
plt.show()

[Figure: example array with numbered regions]

asked Jun 16 '15 01:06

Robbi Bishop-Taylor

1 Answer

Distances between labeled regions of an image can be calculated with the following code,

import itertools

import numpy as np
from scipy.spatial.distance import cdist

# make sure that the IDs are integers (np.int was removed in NumPy 1.24; use the builtin int)
example_array = np.asarray(example_array, dtype=int)
# we assume that the IDs are the consecutive integers 1..n (0 is background)
n = example_array.max()

# (row, column) coordinates of the cells of each region, in ID order
indexes = []
for k in range(1, n + 1):
    tmp = np.nonzero(example_array == k)
    tmp = np.asarray(tmp).T
    indexes.append(tmp)

# calculating the distance matrix
distance_matrix = np.zeros((n, n), dtype=float)
for i, j in itertools.combinations(range(n), 2):
    # use squared Euclidean distance (more efficient), and take the square root only of the single element we are interested in.
    d2 = cdist(indexes[i], indexes[j], metric='sqeuclidean')
    distance_matrix[i, j] = distance_matrix[j, i] = d2.min()**0.5

# mapping the distance matrix to labeled IDs (could be improved/extended)
labels_i, labels_j = np.meshgrid(range(1, n + 1), range(1, n + 1))
results = np.dstack((labels_i, labels_j, distance_matrix)).reshape((-1, 3))

print(distance_matrix)
print(results)

This assumes consecutive integer IDs starting at 1, and would need to be extended if that is not the case. For instance, with the test data above, the calculated distance matrix is,

# From  1             2            3            4            5            6         # To
[[  0.           4.12310563   4.           9.05538514   5.           7.07106781]   # 1
 [  4.12310563   0.           3.16227766  10.81665383   8.24621125   2.        ]   # 2
 [  4.           3.16227766   0.           4.24264069   2.           2.        ]   # 3
 [  9.05538514  10.81665383   4.24264069   0.           3.16227766  10.77032961]   # 4
 [  5.           8.24621125   2.           3.16227766   0.           9.        ]   # 5
 [  7.07106781   2.           2.          10.77032961   9.           0.        ]]  # 6

while the full output can be found here. Note that this takes the Euclidean distance from the center of each pixel. For instance, the distance between zones 3 and 5 is 2.0, while they are separated by 1 pixel.
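If the gap between cell edges is wanted rather than the centre-to-centre distance, one possible adjustment is sketched below. It assumes each cell is a unit square centred on its index, and `edge_gap` is a hypothetical helper name, not part of NumPy or SciPy. The coordinates are the cells in rows 8 and 10, columns 7 and 8 of the sample array (IDs 3 and 5).

```python
import numpy as np


def edge_gap(cells_a, cells_b):
    """Smallest gap between the square footprints of two sets of cells.

    cells_a and cells_b are (N, 2) arrays of row/column indices. Each cell is
    treated as a unit square centred on its index (an assumption of this sketch).
    """
    # absolute row/column offsets between every pair of cells, shape (Na, Nb, 2)
    offsets = np.abs(cells_a[:, None, :] - cells_b[None, :, :])
    # centres k cells apart on one axis leave a gap of k - 1 on that axis
    # (0 when the squares touch or overlap on that axis)
    gaps = np.clip(offsets - 1, 0, None)
    return float(np.sqrt((gaps ** 2).sum(axis=-1)).min())


# cells taken from the sample array: rows 8 and 10, columns 7 and 8
region_3 = np.array([[8, 7], [8, 8]])
region_5 = np.array([[10, 7], [10, 8]])
print(edge_gap(region_3, region_5))
```

Like the cdist approach, this builds the full pairwise offset array, so it is only practical for regions of modest size.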

This is a brute-force approach: it computes every pairwise distance between the pixels of each pair of regions, which should be sufficient for most applications. If you need better performance, have a look at scipy.spatial.cKDTree, which finds the minimum distance between two regions more efficiently than cdist.
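A minimal sketch of the cKDTree route is given below; it is one way to do it, not a tuned implementation. The function name `min_region_distances`, the `cell_size` parameter, and the small `sample` array are assumptions made for illustration. IDs are read with np.unique, so they need not be consecutive, and 0 is treated as background.

```python
import itertools

import numpy as np
from scipy.spatial import cKDTree


def min_region_distances(labels, cell_size=1.0):
    """Return rows of (from_id, to_id, distance) for every pair of regions.

    Distances are measured between cell centres. `cell_size` (hypothetical
    parameter) converts cell units to map units such as metres.
    """
    ids = [int(v) for v in np.unique(labels) if v != 0]  # 0 is background
    coords = {i: np.argwhere(labels == i) for i in ids}  # (row, col) per region
    trees = {i: cKDTree(coords[i]) for i in ids}

    rows = []
    for a, b in itertools.combinations(ids, 2):
        # distance from every cell of region a to its nearest cell in region b
        nearest, _ = trees[b].query(coords[a], k=1)
        rows.append((a, b, float(nearest.min()) * cell_size))
    return rows


# small illustrative array, not the sample data from the question
sample = np.array([[1, 1, 0, 0, 0],
                   [1, 0, 0, 0, 2],
                   [0, 0, 0, 0, 2],
                   [0, 3, 0, 0, 0]])

for row in min_region_distances(sample):
    print(row)
```

Each unordered pair appears once, which gives the "from ID, to ID, distance" layout asked for in the question. Passing the raster resolution as `cell_size` returns distances in metres instead of cells.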

answered Oct 19 '22 18:10

rth

Tags: python, arrays, numpy, scipy, distance
