Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Classifying Python array by nearest "seed" region?

I have a raster of ecological habitats which I've converted into a two-dimensional Python numpy array (example_array below). I also have an array containing "seed" regions with unique values (seed_array below) which I'd like to use to classify my habitat regions. I'd like to 'grow' my seed regions 'into' my habitat regions such that habitats are assigned the ID of the nearest seed region, as measured 'through' the habitat regions. For example:

Image of arrays

My best approach used the ndimage.distance_transform_edt function to create an array depicting the nearest "seed" region to each cell in the dataset, which was then substituted back into the habitat array. This doesn't work particularly well, however, as the function doesn't measure distances "through" my habitat regions, for example below where the red circle represents an incorrectly classified cell:

Incorrect output using ndimage

Below are sample arrays for my habitat and seed data, and an example of the kind of output I'm looking for. My actual datasets are much larger - over a million habitat/seed regions. Any help would be much appreciated!

import numpy as np
import scipy.ndimage as ndimage
import matplotlib.pyplot as plt

# Sample study area array
example_array = np.array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
                          [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1],
                          [0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
                          [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
                          [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1],
                          [1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1],
                          [1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1],
                          [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
                          [1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
                          [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                          [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                          [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

# Plot example array
plt.imshow(example_array, cmap="spectral", interpolation='nearest')

seed_array = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 1, 1, 1, 0, 0, 2, 2, 0, 0, 0, 0],
                       [0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

# Plot seeds
plt.imshow(seed_array, cmap="spectral", interpolation='nearest')

desired_output = np.array([[0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0],
                           [0, 0, 0, 4, 4, 4, 0, 0, 0, 3, 3, 3],
                           [0, 0, 0, 0, 4, 4, 0, 0, 0, 3, 3, 3],
                           [0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 3, 0],
                           [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 3, 3],
                           [1, 1, 0, 1, 0, 0, 0, 0, 2, 2, 3, 3],
                           [1, 1, 1, 1, 0, 0, 2, 2, 2, 0, 0, 3],
                           [1, 1, 1, 1, 1, 2, 2, 2, 2, 0, 0, 0],
                           [1, 1, 1, 1, 0, 0, 2, 2, 2, 0, 0, 0],
                           [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                           [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

# Plot desired output
plt.imshow(desired_output, cmap="spectral", interpolation='nearest')
like image 348
Robbi Bishop-Taylor Avatar asked Aug 06 '15 06:08

Robbi Bishop-Taylor


1 Answers

You can use watershed segmentation from scikits-image:

  • Distance transform

    from scipy import ndimage as nd
    distance = nd.distance_transform_edt(example_array)
    
  • Watershed segmentation

    from skimage.morphology import watershed, square
    result = watershed(-distance, seed_array, mask=example_array, \
                       connectivity=square(3))
    
  • Result

    subplot(1,2,1)
    imshow(-distance, 'spectral', interpolation='none')
    subplot(1,2,2)
    imshow(result, 'spectral', interpolation='none')
    

enter image description here


As another variant, and following your initial approach, you can use watershed to find connected neighbours to nearest seeds. As you mentioned in the question:

  • Calculate distance to the seeds:

    distance = nd.distance_transform_edt(seed_array == 0)
    
  • Calculate watershed in the distance space:

    result = watershed(distance, seed_array, mask=example_array, \
                       connectivity=square(3))
    
  • Plot result:

    figure(figsize=(9,3))
    subplot(1,3,1)
    imshow(distance, 'jet', interpolation='none')
    subplot(1,3,2)
    imshow(np.ma.masked_where(example_array==0, distance), 'jet', interpolation='none')
    subplot(1,3,3)
    imshow(result, 'spectral', interpolation='none')
    

enter image description here


Further discussion: Watershed method tries to grow regions from seeded peaks by flowing through the image gradient. As your image is binary, the regions will expand equally in all directions from the seeded points, and thus give you the point in between two regions. For more info about watershed refer to wikipedia.

In the first example, the distance transform is calculated in the original image, and thus the regions expand equally from seeds until they achieve the splitting point in the middle.

In the second example, the distance transform is calculated from all the pixels to any of the seeded points, and then applying watershed in that space. Watershed basically will assign each pixel to its nearest seed, but it will add a connectivity constrain.

NOTE the sign difference in the distance maps in both plotting and watersed.

NOTE In distance maps (left image in both plots), blue means close where red means far.

like image 153
Imanol Luengo Avatar answered Oct 28 '22 07:10

Imanol Luengo