Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

The fastest way to exclude surrounding zeros from an array representing an image?

I have a 2D array containing grayscale image created from .png as follows:

import cv2

img = cv2.imread("./images/test.png")
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

What I would like to do is to extract a subarray containing only the rectangle containing the data - ignoring all zeros surrounding the picture.

For example, if input is:

  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0 175   0   0   0  71   0
  0   0   0  12   8  54   0   0
  0   0   0   0 255   0   0   0
  0   0   0   2   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

then output should be:

175   0   0   0  71
  0  12   8  54   0
  0   0 255   0   0
  0   2   0   0   0

I could iterate over the rows in forward direction to find the first nonzero row and then iterate over the rows backwards to find the last nonzero row remembering indices - and then repeat the same for the columns and then extract a subarray using that data but I am sure there are more appropriate ways of doing the same or there even might be a NumPy function designed for such a purpose.

If I were to choose between shortest code vs fastest execution I'd be more interested in fastest code execution.

EDIT:
I did not include the best example because there could be zero rows/columns in the middle as here:

Input:

  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0 175   0   0   0  71   0
  0   0   0  12   8  54   0   0
  0   0   0   0 255   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   2   0   0   0   0
  0   0   0   0   0   0   0   0

Output:

175   0   0   0  71
  0  12   8  54   0
  0   0 255   0   0
  0   0   0   0   0
  0   2   0   0   0
like image 868
Chupo_cro Avatar asked Mar 25 '18 05:03

Chupo_cro


1 Answers

Note, not an OpenCV solution - this will work for n-dimensional NumPy or SciPy arrays in general.

(Based on Divakar's answer, extended to n dimensions)

def crop_new(arr):

    mask = arr != 0
    n = mask.ndim
    dims = range(n)
    slices = [None]*n

    for i in dims:
        mask_i = mask.any(tuple(dims[:i] + dims[i+1:]))
        slices[i] = (mask_i.argmax(), len(mask_i) - mask_i[::-1].argmax())

    return arr[[slice(*s) for s in slices]]

Speed tests:

In [42]: np.random.seed(0)

In [43]: a = np.zeros((30, 30, 30, 20),dtype=np.uint8)

In [44]: a[2:-2, 2:-2, 2:-2, 2:-2] = np.random.randint(0,255,(26,26,26,16),dtype
=np.uint8)

In [45]: timeit crop(a) # Old solution
1 loop, best of 3: 181 ms per loop

In [46]: timeit crop_fast(a) # modified fireant's solution for n-dimensions
100 loops, best of 3: 5 ms per loop

In [48]: timeit crop_new(a) # modified Divakar's solution for n-dimensions
100 loops, best of 3: 1.91 ms per loop

Old solution

You can use np.nonzero to get the indices of the array. The bounding box of this array are then contained entirely in the maximum and minimum values of the indices.

def _get_slice_bbox(arr):
    nonzero = np.nonzero(arr)
    return [(min(a), max(a)+1) for a in nonzero]

def crop(arr):
    slice_bbox = _get_slice_bbox(arr)
    return arr[[slice(*a) for a in slice_bbox]]

E.g.

>>> img = np.array([[  0,   0,   0,   0,   0,   0,   0,   0],
                    [  0,   0,   0,   0,   0,   0,   0,   0],
                    [  0,   0, 175,   0,   0,   0,  71,   0],
                    [  0,   0,   0,  12,   8,  54,   0,   0],
                    [  0,   0,   0,   0, 255,   0,   0,   0],
                    [  0,   0,   0,   2,   0,   0,   0,   0],
                    [  0,   0,   0,   0,   0,   0,   0,   0],
                    [  0,   0,   0,   0,   0,   0,   0,   0]],  dtype='uint8')
>>> print crop(img)
[[175   0   0   0  71]
 [  0  12   8  54   0]
 [  0   0 255   0   0]
 [  0   2   0   0   0]]
like image 72
dROOOze Avatar answered Nov 03 '22 02:11

dROOOze