Fast way to turn a labeled image into a dictionary of { label : [coordinates] }

Question

Say I've labeled an image with scipy.ndimage.measurements.label like so:

[[0, 1, 0, 0, 0, 0],
 [0, 1, 0, 0, 0, 0],
 [0, 1, 0, 0, 0, 0],
 [0, 0, 0, 0, 3, 0],
 [2, 2, 0, 0, 0, 0],
 [2, 2, 0, 0, 0, 0]]

What's a fast way to collect the coordinates belonging to each label? I.e. something like:

{ 1: [[0, 1], [1, 1], [2, 1]],
  2: [[4, 0], [4, 1], [5, 0], [5, 1]],
  3: [[3, 4]] }

I'm working with images that are ~15,000 x 5000 pixels in size, and roughly half of each image's pixels are labeled (i.e. non-zero).

Rather than iterating through the entire image with nditer, would it be faster to do something like np.where(img == label) for each label?

EDIT:

Which algorithm is fastest depends on how big the labeled image is as compared to how many labels it has. Warren Weckesser and Salvador Dali / BHAT IRSHAD's methods (which are based on np.nonzero and np.where) all seem to scale linearly with the number of labels, whereas iterating through each image element with nditer obviously scales linearly with the size of labeled image.

The results of a small test:

size: 1000 x 1000, num_labels: 10
weckesser ... 0.214357852936s 
dali ... 0.650229930878s 
nditer ... 6.53645992279s 


size: 1000 x 1000, num_labels: 100
weckesser ... 0.936990022659s 
dali ... 1.33582305908s 
nditer ... 6.81486487389s 


size: 1000 x 1000, num_labels: 1000
weckesser ... 8.43906402588s 
dali ... 9.81333303452s 
nditer ... 7.47897100449s 


size: 1000 x 1000, num_labels: 10000
weckesser ... 100.405524015s 
dali ... 118.17239809s 
nditer ... 9.14583897591s

So the question becomes more specific:

For labeled images in which the number of labels is on the order of sqrt(size(image)) is there an algorithm to gather label coordinates that is faster than iterating through every image element (i.e. with nditer)?

Warren Weckesser · Accepted Answer

Here's a possibility:

import numpy as np

a = np.array([[0, 1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0],
              [0, 0, 0, 0, 3, 0],
              [2, 2, 0, 0, 0, 0],
              [2, 2, 0, 0, 0, 0]])

# If the array was computed using scipy.ndimage.measurements.label, you
# already know how many labels there are.
num_labels = 3

nz = np.nonzero(a)
coords = np.column_stack(nz)
nzvals = a[nz[0], nz[1]]
res = {k:coords[nzvals == k] for k in range(1, num_labels + 1)}

I called this script get_label_indices.py. Here's a sample run:

In [97]: import pprint

In [98]: run get_label_indices.py

In [99]: pprint.pprint(res)
{1: array([[0, 1],
       [1, 1],
       [2, 1]]),
 2: array([[4, 0],
       [4, 1],
       [5, 0],
       [5, 1]]),
 3: array([[3, 4]])}

Salvador Dali · Answer

You can do something like this (let img is your original nd.array)

res = {}
for i in np.unique(img)[1:]:
  x, y = np.where(a == i)
  res[i] = zip(list(x), list(y))

which will give you what you want:

{
 1: [(0, 1), (1, 1), (2, 1)],
 2: [(4, 0), (4, 1), (5, 0), (5, 1)],
 3: [(3, 4)]
}

Whether it will be faster - is up to the benchmark to determine.

Per Warren's suggestion, I do not need to use unique and can just do

res = {}
for i in range(1, num_labels + 1)
    x, y = np.where(a == i)
    res[i] = zip(list(x), list(y))

Fast way to turn a labeled image into a dictionary of { label : [coordinates] }

Tags:

python

image-processing

numpy

scipy

bennlich

2 Answers

Warren Weckesser

Salvador Dali

Recent Activity

Donate For Us

Fast way to turn a labeled image into a dictionary of { label : [coordinates] }

Tags:

python

image-processing

numpy

scipy

bennlich

2 Answers

Warren Weckesser

Salvador Dali

Related questions

Recent Activity

Donate For Us