I have a raster with a set of unique ID patches/regions which I've converted into a two-dimensional Python numpy array. I would like to calculate pairwise Euclidean distances between all regions to obtain the minimum distance separating the nearest edges of each raster patch. As the array was originally a raster, a solution needs to account for diagonal distances across cells (I can always convert any distances measured in cells back to metres by multiplying by the raster resolution).
I've experimented with the cdist
function from scipy.spatial.distance
as suggested in this answer to a related question, but so far I've been unable to solve my problem using the available documentation. As an end result I would ideally have a 3 by X array in the form of "from ID, to ID, distance", including distances between all possible combinations of regions.
Here's a sample dataset resembling my input data:
import numpy as np
import matplotlib.pyplot as plt
# Sample study area array
example_array = np.array([[0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 0, 2, 2, 0, 6, 0, 3, 3, 3],
[0, 0, 0, 0, 2, 2, 0, 0, 0, 3, 3, 3],
[0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 3, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3],
[1, 1, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3],
[1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 3],
[1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 0, 5, 5, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4]])
# Plot array
plt.imshow(example_array, cmap="spectral", interpolation='nearest')
In this method, we first initialize two numpy arrays. Then, we take the difference of the two arrays, compute the dot product of the result, and transpose of the result. Then we take the square root of the answer. This is another way to implement Euclidean distance.
Python – Pairwise distances of n-dimensional space array. scipy.stats.pdist (array, axis=0) function calculates the Pairwise distances between observations in n-dimensional space. axis: Axis along which to be computed. By default axis = 0. Returns : Pairwise distances of the array elements based on the set parameters.
Calculating distance between two geo-locations in Python. Step 1: Installing “haversine”. Step 2: Importing library After installing the library import it import haversine as hs. Step 3: Calculating distance between two locations loc1= (28.426846,77.088834) loc2= (28.394231,77.050308) hs.haversine ...
Euclidean Distance is a distance between two points in space that can be measured with the help of the Pythagorean formula. The formula is shown below: square root of [ (x-a)^2 + (y-b)^2 + (z-c)^2 ]. To calculate the Euclidean Distance between two coordinate points we will be making use of the numpy module in python.
The Manhattan distance between two vectors/arrays (say A and B) , is calculated as Σ|A i – B i | where A i is the ith element in the first array and B i is the ith element in the second array. The output of the code mentioned above is displayed below.
Distances between labeled regions of an image can be calculated with the following code,
import itertools
from scipy.spatial.distance import cdist
# making sure that IDs are integer
example_array = np.asarray(example_array, dtype=np.int)
# we assume that IDs start from 1, so we have n-1 unique IDs between 1 and n
n = example_array.max()
indexes = []
for k in range(1, n):
tmp = np.nonzero(example_array == k)
tmp = np.asarray(tmp).T
indexes.append(tmp)
# calculating the distance matrix
distance_matrix = np.zeros((n-1, n-1), dtype=np.float)
for i, j in itertools.combinations(range(n-1), 2):
# use squared Euclidean distance (more efficient), and take the square root only of the single element we are interested in.
d2 = cdist(indexes[i], indexes[j], metric='sqeuclidean')
distance_matrix[i, j] = distance_matrix[j, i] = d2.min()**0.5
# mapping the distance matrix to labeled IDs (could be improved/extended)
labels_i, labels_j = np.meshgrid( range(1, n), range(1, n))
results = np.dstack((labels_i, labels_j, distance_matrix)).reshape((-1, 3))
print(distance_matrix)
print(results)
This assumes integer IDs, and would need to be extended if that is not the case. For instance, with the test data above, the calculated distance matrix is,
# From 1 2 3 4 5 # To
[[ 0. 4.12310563 4. 9.05538514 5. ] # 1
[ 4.12310563 0. 3.16227766 10.81665383 8.24621125] # 2
[ 4. 3.16227766 0. 4.24264069 2. ] # 3
[ 9.05538514 10.81665383 4.24264069 0. 3.16227766] # 4
[ 5. 8.24621125 2. 3.16227766 0. ]] # 5
while the full output can be found here. Note that this takes the Eucledian distance from the center of each pixel. For instance, the distance between zones 1 and 3 is 2.0, while they are separated by 1 pixel.
This is a brute-force approach, where we calculate all the pairwise distances between pixels of different regions. This should be sufficient for most applications. Still, if you need better performance, have a look at scipy.spatial.cKDTree
which would be more efficient in computing the minimum distance between two regions, when compared to cdist
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With