
OpenCV, Python: How to use mask parameter in ORB feature detector

By reading a few answers on Stack Overflow, I've learned this much so far:

The mask has to be a numpy array (which has the same shape as the image) with data type CV_8UC1 and have values from 0 to 255.

What is the meaning of these numbers, though? Is it that any pixels with a corresponding mask value of zero will be ignored in the detection process and any pixels with a mask value of 255 will be used? What about the values in between?

Also, how do I initialize a numpy array with data type CV_8UC1 in Python? Can I just use dtype=cv2.CV_8UC1?

Here is the code I am using currently, based on the assumptions I'm making above. But the issue is that I don't get any keypoints when I run detectAndCompute for either image. I have a feeling it might be because the mask isn't the correct data type. If I'm right about that, how do I correct it?

# convert images to grayscale
base_gray = cv2.cvtColor(self.base, cv2.COLOR_BGRA2GRAY)
curr_gray = cv2.cvtColor(self.curr, cv2.COLOR_BGRA2GRAY)

# initialize feature detector
detector = cv2.ORB_create()

# create a mask using the alpha channel of the original image--don't
# use transparent or partially transparent parts
base_cond = self.base[:,:,3] == 255
base_mask = np.array(np.where(base_cond, 255, 0))

curr_cond = self.base[:,:,3] == 255
curr_mask = np.array(np.where(curr_cond, 255, 0), dtype=np.uint8)

# use the mask and grayscale images to detect good features
base_keys, base_desc = detector.detectAndCompute(base_gray, mask=base_mask)
curr_keys, curr_desc = detector.detectAndCompute(curr_gray, mask=curr_mask)

print("base keys: ", base_keys)
# []
print("curr keys: ", curr_keys)
# []
Victor Odouard asked Aug 22 '17 06:08



1 Answer

So here is most, if not all, of the answer:

What is the meaning of those numbers

0 means to ignore the pixel and 255 means to use it. I'm still unclear on the values in between, but I don't think all nonzero values are considered "equivalent" to 255 in the mask. See here.

Also, how do I initialize a numpy array with data type CV_8UC1 in python?

The type CV_8U is an unsigned 8-bit integer, which in numpy is numpy.uint8. The C1 suffix means that the array has 1 channel, as opposed to 3 channels for color images and 4 channels for RGBA images. So, to create a 1-channel array of unsigned 8-bit integers:

import numpy as np
np.zeros((480, 720), dtype=np.uint8)

(a three-channel array would have shape (480, 720, 3), four-channel (480, 720, 4), etc.) This mask would cause the detector and extractor to ignore the entire image, though, since it's all zeros.
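As a minimal sketch of a mask that keeps only part of the image, you could start from all zeros and set the region you want searched to 255. The image size and rectangle bounds below are made up for illustration:

```python
import numpy as np

# Hypothetical image size, for illustration only.
height, width = 480, 720

# Start with every pixel ignored (0)...
mask = np.zeros((height, width), dtype=np.uint8)

# ...then mark a rectangular region as usable (255).
# Rows 100..299 and columns 200..499 are arbitrary example bounds.
mask[100:300, 200:500] = 255

print(mask.shape, mask.dtype)   # (480, 720) uint8
print(np.count_nonzero(mask))   # 200 rows * 300 columns = 60000
```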

how do I correct [the code]?

There were two separate issues, one behind each empty keypoint array.

First, I forgot to set the type for the base_mask:

base_mask = np.array(np.where(base_cond, 255, 0)) # wrong
base_mask = np.array(np.where(base_cond, 255, 0), dtype=np.uint8) # right
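To see why the dtype matters, here is a small check using a made-up 2x2 alpha channel (the array values are assumptions for illustration). Without an explicit dtype, np.where with plain Python integers returns NumPy's default integer type rather than uint8:

```python
import numpy as np

# Made-up alpha channel: two fully opaque pixels, two that are not.
alpha = np.array([[255, 0], [128, 255]], dtype=np.uint8)
cond = alpha == 255

# No dtype given: the result uses NumPy's default integer type.
no_dtype = np.array(np.where(cond, 255, 0))
# Explicit dtype: the result is the 8-bit unsigned type the mask needs.
with_dtype = np.array(np.where(cond, 255, 0), dtype=np.uint8)

print(no_dtype.dtype)    # platform default integer, not uint8
print(with_dtype.dtype)  # uint8
```

An equivalent way to write the fix is np.where(cond, 255, 0).astype(np.uint8).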

Second, I used the wrong image to generate my curr_cond array:

curr_cond = self.base[:,:,3] == 255 # wrong
curr_cond = self.curr[:,:,3] == 255 # right

Some pretty dumb mistakes.

Here is the full corrected code:

# convert images to grayscale
base_gray = cv2.cvtColor(self.base, cv2.COLOR_BGRA2GRAY)
curr_gray = cv2.cvtColor(self.curr, cv2.COLOR_BGRA2GRAY)

# initialize feature detector
detector = cv2.ORB_create()

# create a mask using the alpha channel of the original image--don't
# use transparent or partially transparent parts
base_cond = self.base[:,:,3] == 255
base_mask = np.array(np.where(base_cond, 255, 0), dtype=np.uint8)

curr_cond = self.curr[:,:,3] == 255
curr_mask = np.array(np.where(curr_cond, 255, 0), dtype=np.uint8)

# use the mask and grayscale images to detect good features
base_keys, base_desc = detector.detectAndCompute(base_gray, mask=base_mask)
curr_keys, curr_desc = detector.detectAndCompute(curr_gray, mask=curr_mask)
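One way to make the second mistake harder to repeat is to build each mask from the same image it will be used with, inside a small helper. This is only a sketch: alpha_to_mask is a hypothetical name, not part of OpenCV, and it assumes the image is a 4-channel array with alpha in the last channel.

```python
import numpy as np

def alpha_to_mask(image_bgra):
    """Return a uint8 mask: 255 where alpha is fully opaque, 0 elsewhere.

    Hypothetical helper; assumes image_bgra has shape (rows, cols, 4)
    with the alpha channel last.
    """
    if image_bgra.ndim != 3 or image_bgra.shape[2] != 4:
        raise ValueError("expected a 4-channel image")
    opaque = image_bgra[:, :, 3] == 255
    return np.where(opaque, 255, 0).astype(np.uint8)

# Tiny made-up 2x2 image: only the top-left pixel is fully opaque.
demo = np.zeros((2, 2, 4), dtype=np.uint8)
demo[0, 0, 3] = 255
demo[1, 1, 3] = 200  # partially transparent, so it is masked out
demo_mask = alpha_to_mask(demo)
print(demo_mask)
```

With a helper like this, the two mask lines in the corrected code would become base_mask = alpha_to_mask(self.base) and curr_mask = alpha_to_mask(self.curr).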

TL;DR: The mask parameter is a 1-channel numpy array with the same shape as the grayscale image in which you are trying to find features (if image shape is (480, 720), so is mask).

The values in the array are of type np.uint8: 255 means "use this pixel" and 0 means "don't".
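The conditions in the TL;DR can be checked before calling the detector. The sketch below uses a hypothetical check_mask helper (not an OpenCV function) and made-up array sizes:

```python
import numpy as np

def check_mask(gray, mask):
    """Raise ValueError if mask does not meet the TL;DR conditions.

    Hypothetical helper for illustration; gray is the grayscale image.
    """
    if mask.ndim != 2:
        raise ValueError("mask must be 1-channel (a 2-D array)")
    if mask.shape != gray.shape:
        raise ValueError(f"mask shape {mask.shape} != image shape {gray.shape}")
    if mask.dtype != np.uint8:
        raise ValueError(f"mask dtype must be uint8, got {mask.dtype}")

gray = np.zeros((480, 720), dtype=np.uint8)      # stand-in grayscale image
good = np.full((480, 720), 255, dtype=np.uint8)  # use every pixel
check_mask(gray, good)                           # passes silently
```

A mask built without dtype=np.uint8, or one built from a differently sized image, would raise here instead of silently producing no keypoints.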

Thanks to Dan Mašek for leading me to parts of this answer.

Victor Odouard answered Oct 17 '22 21:10