I am using the measure.regionprops function from scikit-image to measure the properties of connected components. It computes a bunch of properties (Python-regionprops). However, I only need the area of each connected component. Is there a way to compute just that single property and save computation?
There seems to be a more direct way to do the same thing using regionprops with cache=False. I generated labels using skimage.segmentation.slic with n_segments=10000. Then:
rps = regionprops(labels, cache=False)
[r.area for r in rps]
My understanding of the regionprops documentation is that setting cache=False means that the attributes won't be calculated until they're called. According to %%time in a Jupyter notebook, running the code above took 166 ms with cache=False vs. 247 ms with cache=True, so it seems to work.
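If only the area is needed, regionprops_table may also be worth a try, since it takes an explicit list of properties. The following is a minimal sketch rather than a benchmark: the small array toy_labels is made up for illustration, and it assumes a scikit-image release that includes regionprops_table.

```python
import numpy as np
from skimage.measure import regionprops_table

# Hypothetical toy label image: 0 is background, 1 and 2 are two regions.
toy_labels = np.array([[1, 1, 0, 0],
                       [1, 0, 0, 2],
                       [0, 0, 2, 2]])

# Request only the properties of interest; the result is a dict of arrays.
props = regionprops_table(toy_labels, properties=("label", "area"))

# Map each label to its pixel count.
areas = dict(zip(props["label"].tolist(), props["area"].tolist()))
print(areas)
```

I have not timed this against the cache=False approach, so treat it as an alternative to measure, not as a guaranteed speed-up.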
I tried an equivalent of the other answer and found it much slower.
%%time
ard = np.empty(10000, dtype=int)
for i in range(10000):
    # Count the pixels that carry label i.
    ard[i] = np.size(np.where(labels == i)[1])
That took 34.3 seconds.
Here's a full working example comparing the two methods using the skimage astronaut sample image and labels generated by slic segmentation:
import numpy as np
import skimage.measure
from skimage.segmentation import slic
from skimage.data import astronaut
img = astronaut()
# `+ 1` is added to avoid a region with the label of `0`
# zero is considered unlabeled so isn't counted by regionprops
# but would be counted by the other method.
segments = slic(img, n_segments=1000, compactness=10) + 1
# This is just to make it more like the original poster's
# question.
labels, num = skimage.measure.label(segments, return_num=True)
Calculate areas using the OP's suggested method, with index values adjusted to avoid having a zero label:
%%time
area = {}
for i in range(1, num + 1):
    area[i + 1] = np.size(np.where(labels == i)[1])
CPU times: user 512 ms, sys: 0 ns, total: 512 ms
Wall time: 506 ms
Same calculation using regionprops:
%%time
rps = skimage.measure.regionprops(labels, cache=False)
area2 = [r.area for r in rps]
CPU times: user 16.6 ms, sys: 0 ns, total: 16.6 ms
Wall time: 16.2 ms
Verify that the results are all equal element-wise:
np.equal(list(area.values()), area2).all()
True
So, as long as zero labels and the difference in indexing are accounted for, both methods give the same result, but regionprops without caching is faster.
I found a way to avoid regionprops, which computes all the properties, when all we need is the area of the connected components. When the connected components are labelled using the label command, we can compute the size of each component by counting the number of pixels with a given label. So, basically
labels, num = label(image, return_num=True)
area = {}
for i in range(num):
    area[i] = np.size(np.where(labels == i)[1])
will compute the number of pixels in each connected component.
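The same pixel counting can be done in one pass with np.bincount, which is a different technique from the loop above. This is only a sketch: the array toy_labels is a made-up stand-in for the output of label, with 0 as background.

```python
import numpy as np

# Hypothetical label image standing in for the output of label():
# 0 is background, 1..3 are connected components.
toy_labels = np.array([[1, 1, 0, 2],
                       [1, 0, 0, 2],
                       [0, 3, 0, 2]])

# counts[k] is the number of pixels whose label is k.
counts = np.bincount(toy_labels.ravel())

# Drop index 0 (background) so that areas[k - 1] belongs to label k.
areas = counts[1:]
print(areas)
```

Because bincount indexes by label value, it assumes non-negative integer labels, which is what label returns.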
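@optimist
Your non-regionprops method gave inaccurate results for me. It picked up some unwanted noise and incorrectly calculated one of the shapes. The loop over range(num) starts at label 0, which is the background, and never reaches the last label, num.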
import numpy as np
from skimage.measure import label, regionprops
import matplotlib.pyplot as plt
arr = np.array([[1,0,1,0,0,0,1],
[1,1,1,0,0,0,1],
[0,1,1,0,0,0,1],
[0,1,1,0,0,1,1],
[0,0,0,0,1,1,1],
[0,0,0,1,1,1,1],
[1,0,0,1,1,1,1],
[1,0,0,1,1,1,1],
[1,0,0,1,1,1,1]])
area = {}
labels, num = label(arr, return_num=True)
for i in range(num):
    print(i)
    area[i] = np.size(np.where(labels == i)[1])
    print(area[i])
plt.imshow(labels)
plt.show();
rps = regionprops(labels, cache=False)
[r.area for r in rps]
Out: [9, 24, 3]
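For comparison, here is a sketch of the counting loop with the range shifted to run from 1 to num inclusive, so that the background is skipped and the last label is included. It reuses the same array; np.count_nonzero is my substitution for np.size(np.where(...)) and is not part of the original method.

```python
import numpy as np
from skimage.measure import label

arr = np.array([[1, 0, 1, 0, 0, 0, 1],
                [1, 1, 1, 0, 0, 0, 1],
                [0, 1, 1, 0, 0, 0, 1],
                [0, 1, 1, 0, 0, 1, 1],
                [0, 0, 0, 0, 1, 1, 1],
                [0, 0, 0, 1, 1, 1, 1],
                [1, 0, 0, 1, 1, 1, 1],
                [1, 0, 0, 1, 1, 1, 1],
                [1, 0, 0, 1, 1, 1, 1]])

labels, num = label(arr, return_num=True)

# Labels run from 1 to num inclusive; 0 is the background.
area = {i: int(np.count_nonzero(labels == i)) for i in range(1, num + 1)}
print(area)
```

With the range adjusted this way, the counts should agree with the regionprops areas.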