
Remove empty 'rows' and 'columns' from 3D numpy pixel array

I essentially want to crop an image with numpy. I have a 3-dimensional numpy.ndarray object, i.e.:

[ [ [0,0,0,0], [255,255,255,255], .... ],
  [ [0,0,0,0], [255,255,255,255], .... ] ]

where I want to remove whitespace, which, in context, is known to be either entire rows or entire columns of [0,0,0,0].

Letting each pixel just be a number for this example, I'm trying to essentially do this:

Given this: *EDIT: chose a slightly more complex example to clarify

[ [0,0,0,0,0,0],
  [0,0,1,1,1,0],
  [0,1,1,0,1,0],
  [0,0,0,1,1,0],
  [0,0,0,0,0,0] ]

I'm trying to create this:

[ [0,1,1,1], [1,1,0,1], [0,0,1,1] ]

I can brute force this with loops, but intuitively I feel like numpy has a better means of doing this.

Jonline Avatar asked Dec 14 '25 05:12



1 Answer

In general, you'd want to look into scipy.ndimage.label and scipy.ndimage.find_objects to extract the bounding box of contiguous regions fulfilling a condition.
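As a rough sketch of that more general route, here is how the two scipy functions might be combined on the 2D example from the question. It assumes scipy is installed and that "non-empty" simply means nonzero; the variable names are only illustrative.

```python
import numpy as np
from scipy import ndimage

im = np.array([[0, 0, 0, 0, 0, 0],
               [0, 0, 1, 1, 1, 0],
               [0, 1, 1, 0, 1, 0],
               [0, 0, 0, 1, 1, 0],
               [0, 0, 0, 0, 0, 0]])

# Label connected regions of nonzero pixels (4-connectivity by default).
labeled, n_regions = ndimage.label(im != 0)

# find_objects returns one tuple of slices (a bounding box) per labelled region.
boxes = ndimage.find_objects(labeled)

# In this example all nonzero pixels are connected, so there is a single box.
crop = im[boxes[0]]
print(n_regions)
print(crop)
```

If the image held several separate regions, you would get one bounding box per region and would have to decide how to combine them, which is one reason the plain numpy approach is simpler for this question.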

However, in this case, you can do it fairly easily with "plain" numpy.

I'm going to assume you have a nrows x ncols x nbands array here. The other convention of nbands x nrows x ncols is also quite common, so have a look at the shape of your array.
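If your array turns out to be band-first, one way to get it into the layout assumed here is np.moveaxis. A small sketch, where band_first is a made-up placeholder array rather than anything from the question:

```python
import numpy as np

# Hypothetical band-first image: 4 bands, 5 rows, 6 columns.
band_first = np.zeros((4, 5, 6), dtype=np.uint8)

# Move the band axis to the end so the array is nrows x ncols x nbands.
im = np.moveaxis(band_first, 0, -1)
print(im.shape)  # (5, 6, 4)
```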

With that in mind, you might do something similar to:

import numpy as np

# A pixel counts as empty only if every band is zero.
empty = (im == 0).all(axis=2)
rows = np.flatnonzero((~empty).sum(axis=1))
cols = np.flatnonzero((~empty).sum(axis=0))

crop = im[rows.min():rows.max()+1, cols.min():cols.max()+1, :]
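One thing to watch for: rows.min() and rows.max() raise a ValueError when the index array is empty, which happens if the whole image is blank. Below is a minimal sketch that wraps the same idea in a function and guards against that case. The name crop_empty_border and the choice to return a zero-sized array for a blank image are my own assumptions, not part of numpy.

```python
import numpy as np

def crop_empty_border(im):
    """Crop away border rows/columns whose pixels are all zero.

    Hypothetical helper: accepts a 2D array or an
    nrows x ncols x nbands array.
    """
    nonempty = im != 0
    if im.ndim == 3:
        # A pixel is non-empty if any of its bands is nonzero.
        nonempty = nonempty.any(axis=2)

    rows = np.flatnonzero(nonempty.any(axis=1))
    cols = np.flatnonzero(nonempty.any(axis=0))

    # Guard: nothing to keep if the image is entirely empty.
    if rows.size == 0:
        return im[:0, :0]

    return im[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# 4 x 5 RGBA image with a 2 x 3 block of opaque white pixels.
im = np.zeros((4, 5, 4), dtype=np.uint8)
im[1:3, 1:4] = 255
crop = crop_empty_border(im)
print(crop.shape)  # (2, 3, 4)
```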

For your 2D example, it would look like:

import numpy as np

im = np.array([[0,0,0,0,0,0],
               [0,0,1,1,1,0],
               [0,1,1,0,1,0],
               [0,0,0,1,1,0],
               [0,0,0,0,0,0]])

mask = im == 0
rows = np.flatnonzero((~mask).sum(axis=1))
cols = np.flatnonzero((~mask).sum(axis=0))

crop = im[rows.min():rows.max()+1, cols.min():cols.max()+1]
print(crop)

Let's break down the 2D example a bit.

In [1]: import numpy as np

In [2]: im = np.array([[0,0,0,0,0,0],
   ...:                [0,0,1,1,1,0],
   ...:                [0,1,1,0,1,0],
   ...:                [0,0,0,1,1,0],
   ...:                [0,0,0,0,0,0]])

Okay, now let's create a boolean array that meets our condition:

In [3]: mask = im == 0

In [4]: mask
Out[4]:
array([[ True,  True,  True,  True,  True,  True],
       [ True,  True, False, False, False,  True],
       [ True, False, False,  True, False,  True],
       [ True,  True,  True, False, False,  True],
       [ True,  True,  True,  True,  True,  True]], dtype=bool)

Also, note that the ~ operator functions as logical_not on boolean arrays:

In [5]: ~mask
Out[5]:
array([[False, False, False, False, False, False],
       [False, False,  True,  True,  True, False],
       [False,  True,  True, False,  True, False],
       [False, False, False,  True,  True, False],
       [False, False, False, False, False, False]], dtype=bool)
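A caveat worth keeping in mind: ~ only behaves like logical_not on boolean arrays. On integer arrays it is a bitwise NOT, so applying it to the image directly instead of to the mask would give a different result. A quick illustration with made-up arrays:

```python
import numpy as np

flags = np.array([True, False])
counts = np.array([1, 0])

print(~flags)                  # [False  True]
print(~counts)                 # [-2 -1]  (bitwise NOT, not logical)
print(np.logical_not(counts))  # [False  True]
```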

With that in mind, to find rows where every element of ~mask is False (that is, rows of im that are entirely zero), we can sum across columns:

In [6]: (~mask).sum(axis=1)
Out[6]: array([0, 3, 3, 2, 0])

If no elements are True, we'll get a 0.

And similarly to find columns where all elements are false, we can sum across rows:

In [7]: (~mask).sum(axis=0)
Out[7]: array([0, 1, 2, 2, 3, 0])
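If you only care whether a row or column contains anything at all, rather than how many nonzero pixels it has, any() along the same axes is an equivalent option that returns booleans directly. A small sketch on the same example array:

```python
import numpy as np

im = np.array([[0, 0, 0, 0, 0, 0],
               [0, 0, 1, 1, 1, 0],
               [0, 1, 1, 0, 1, 0],
               [0, 0, 0, 1, 1, 0],
               [0, 0, 0, 0, 0, 0]])

nonzero = im != 0

# True where the row / column holds at least one nonzero pixel.
row_has_data = nonzero.any(axis=1)
col_has_data = nonzero.any(axis=0)

print(row_has_data)  # [False  True  True  True False]
print(col_has_data)  # [False  True  True  True  True False]
```

np.flatnonzero accepts these boolean arrays just as it accepts the sums, so the rest of the approach is unchanged.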

Now all we need to do is find the first and last of these that are not zero. np.flatnonzero is a bit easier than nonzero, in this case:

In [8]: np.flatnonzero((~mask).sum(axis=1))
Out[8]: array([1, 2, 3])

In [9]: np.flatnonzero((~mask).sum(axis=0))
Out[9]: array([1, 2, 3, 4])
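To show why flatnonzero is the more convenient of the two here: np.nonzero returns a tuple with one index array per dimension, so for a 1D input you would have to unpack it, while np.flatnonzero hands back the index array itself. A short comparison using the row sums from above:

```python
import numpy as np

row_counts = np.array([0, 3, 3, 2, 0])

# nonzero: a 1-tuple holding the index array, so it needs unpacking.
(idx_from_nonzero,) = np.nonzero(row_counts)

# flatnonzero: the index array directly.
idx_from_flat = np.flatnonzero(row_counts)

print(idx_from_nonzero)  # [1 2 3]
print(idx_from_flat)     # [1 2 3]
```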

Then, you can easily slice out the region based on min/max nonzero elements:

In [10]: rows = np.flatnonzero((~mask).sum(axis=1))

In [11]: cols = np.flatnonzero((~mask).sum(axis=0))

In [12]: im[rows.min():rows.max()+1, cols.min():cols.max()+1]
Out[12]:
array([[0, 1, 1, 1],
       [1, 1, 0, 1],
       [0, 0, 1, 1]])
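Note that the min/max slice is a bounding-box crop: it keeps any all-zero rows or columns that sit between non-empty ones. If you instead wanted to drop every empty row and column, including interior ones, one option is to index with np.ix_. A sketch on a made-up array chosen to show the difference:

```python
import numpy as np

im = np.array([[0, 0, 0, 0],
               [0, 1, 0, 2],
               [0, 0, 0, 0],
               [0, 3, 0, 4],
               [0, 0, 0, 0]])

nonzero = im != 0
rows = np.flatnonzero(nonzero.any(axis=1))  # [1 3]
cols = np.flatnonzero(nonzero.any(axis=0))  # [1 3]

# Bounding-box crop: the empty middle row and column survive.
bbox = im[rows.min():rows.max() + 1, cols.min():cols.max() + 1]

# np.ix_ keeps only the listed rows and columns.
compact = im[np.ix_(rows, cols)]

print(bbox)     # 3 x 3, with zeros in the middle row and column
print(compact)  # [[1 2]
                #  [3 4]]
```

For cropping whitespace around an image you almost certainly want the bounding-box version, since removing interior rows would distort the picture.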
Joe Kington Avatar answered Dec 15 '25 18:12
