I have a raster of ecological habitats which I've converted into a two-dimensional Python numpy array (example_array below). I also have an array containing "seed" regions with unique values (seed_array below) which I'd like to use to classify my habitat regions. I'd like to 'grow' my seed regions 'into' my habitat regions such that habitats are assigned the ID of the nearest seed region, as measured 'through' the habitat regions. For example:
My best approach used the ndimage.distance_transform_edt
function to create an array depicting the nearest "seed" region to each cell in the dataset, which was then substituted back into the habitat array. This doesn't work particularly well, however, as the function doesn't measure distances "through" my habitat regions, for example below where the red circle represents an incorrectly classified cell:
Below are sample arrays for my habitat and seed data, and an example of the kind of output I'm looking for. My actual datasets are much larger - over a million habitat/seed regions. Any help would be much appreciated!
import numpy as np
import scipy.ndimage as ndimage
import matplotlib.pyplot as plt
# Sample study area array
example_array = np.array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1],
[1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1],
[1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# Plot example array
plt.imshow(example_array, cmap="spectral", interpolation='nearest')
seed_array = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 0, 0, 2, 2, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# Plot seeds
plt.imshow(seed_array, cmap="spectral", interpolation='nearest')
desired_output = np.array([[0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 4, 4, 4, 0, 0, 0, 3, 3, 3],
[0, 0, 0, 0, 4, 4, 0, 0, 0, 3, 3, 3],
[0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 3, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 3, 3],
[1, 1, 0, 1, 0, 0, 0, 0, 2, 2, 3, 3],
[1, 1, 1, 1, 0, 0, 2, 2, 2, 0, 0, 3],
[1, 1, 1, 1, 1, 2, 2, 2, 2, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 2, 2, 2, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# Plot desired output
plt.imshow(desired_output, cmap="spectral", interpolation='nearest')
You can use watershed segmentation from scikits-image:
Distance transform
from scipy import ndimage as nd
distance = nd.distance_transform_edt(example_array)
Watershed segmentation
from skimage.morphology import watershed, square
result = watershed(-distance, seed_array, mask=example_array, \
connectivity=square(3))
Result
subplot(1,2,1)
imshow(-distance, 'spectral', interpolation='none')
subplot(1,2,2)
imshow(result, 'spectral', interpolation='none')
As another variant, and following your initial approach, you can use watershed to find connected neighbours to nearest seeds. As you mentioned in the question:
Calculate distance to the seeds:
distance = nd.distance_transform_edt(seed_array == 0)
Calculate watershed in the distance space:
result = watershed(distance, seed_array, mask=example_array, \
connectivity=square(3))
Plot result:
figure(figsize=(9,3))
subplot(1,3,1)
imshow(distance, 'jet', interpolation='none')
subplot(1,3,2)
imshow(np.ma.masked_where(example_array==0, distance), 'jet', interpolation='none')
subplot(1,3,3)
imshow(result, 'spectral', interpolation='none')
Further discussion: Watershed method tries to grow regions from seeded peaks by flowing through the image gradient. As your image is binary, the regions will expand equally in all directions from the seeded points, and thus give you the point in between two regions. For more info about watershed refer to wikipedia.
In the first example, the distance transform is calculated in the original image, and thus the regions expand equally from seeds until they achieve the splitting point in the middle.
In the second example, the distance transform is calculated from all the pixels to any of the seeded points, and then applying watershed in that space. Watershed basically will assign each pixel to its nearest seed, but it will add a connectivity constrain.
NOTE the sign difference in the distance maps in both plotting and watersed.
NOTE In distance maps (left image in both plots), blue means close where red means far.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With