Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Computer Vision: Masking a human hand

I'd like to detect my hand from a live video stream and create a mask of my hand. However I'm reaching quite a poor result, as you can see from the picture.

My goal is to track the hand movement, so what I did was convert the video stream from BGR to HSV color space then I thresholded the image in order to isolate the color of my hand, then I tried to find the contours of my hand although the final result isn't quite what I wanted to achieve.

How could I improve the end result?

import cv2
import numpy as np

cam = cv2.VideoCapture(1)
cam.set(3,640)
cam.set(4,480)
ret, image = cam.read()

skin_min = np.array([0, 40, 150],np.uint8)
skin_max = np.array([20, 150, 255],np.uint8)    
while True:
    ret, image = cam.read()

    gaussian_blur = cv2.GaussianBlur(image,(5,5),0)
    blur_hsv = cv2.cvtColor(gaussian_blur, cv2.COLOR_BGR2HSV)

#threshould using min and max values
    tre_green = cv2.inRange(blur_hsv, skin_min, skin_max)
#getting object green contour
    contours, hierarchy = cv2.findContours(tre_green,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)

#draw contours
    cv2.drawContours(image,contours,-1,(0,255,0),3)

    cv2.imshow('real', image)
    cv2.imshow('tre_green', tre_green)   

    key = cv2.waitKey(10)
    if key == 27:
        break

Here the link with the pictures: https://picasaweb.google.com/103610822612915300423/February7201303. New link with image plus contours, mask, and original. https://picasaweb.google.com/103610822612915300423/February7201304

And here's a sample picture from above:

Sample picture of a torso with arms ... and a hand

like image 611
wind85 Avatar asked Feb 07 '13 13:02

wind85


People also ask

What is masking in computer vision?

Masking is an image processing method in which we define a small 'image piece' and use it to modify a larger image. Masking is the process that is underneath many types of image processing, including edge detection, motion detection, and noise reduction.

What is CV mask?

Masking is a common technique to extract the Region of Interest (ROI). In openCV, it is possible to construct arbitrary masking shape using draw function and bitwise operation.

How do you mask an image?

Quick steps for creating a clipping mask: Select a text or graphic layer to fill with an image. Click Fill with image on the tool palette & choose an image. Select Edit image fill on the Text Tools panel. Adjust the image behind your text or shapes, then click Done.

How do I extract an image from a mask?

We first create a binary mask from a specified ROI, which returns a matrix containing only 1 or 0. To return the selected portion of the image, we perform a logical AND of the binary mask matrix with the image using logical indexing. The resulting image will contain the extracted portion.


2 Answers

There are many ways to perform pixel-wise threshold to separate "skin pixels" from "non-skin pixels", and there are papers based on virtually any colorspace (even with RGB). So, my answer is simply based on the paper Face Segmentation Using Skin-Color Map in Videophone Applications by Chai and Ngan. They worked with the YCbCr colorspace and got quite nice results, the paper also mentions a threshold that worked well for them:

(Cb in [77, 127]) and (Cr in [133, 173])

The thresholds for the Y channel are not specified, but there are papers that mention Y > 80. For your single image, Y in the whole range is fine, i.e. it doesn't matter for actually distinguishing skin.

Here is the input, the binary image according to the thresholds mentioned, and the resulting image after discarding small components.

enter image description hereenter image description hereenter image description here

import sys
import numpy
import cv2

im = cv2.imread(sys.argv[1])
im_ycrcb = cv2.cvtColor(im, cv2.COLOR_BGR2YCR_CB)

skin_ycrcb_mint = numpy.array((0, 133, 77))
skin_ycrcb_maxt = numpy.array((255, 173, 127))
skin_ycrcb = cv2.inRange(im_ycrcb, skin_ycrcb_mint, skin_ycrcb_maxt)
cv2.imwrite(sys.argv[2], skin_ycrcb) # Second image

contours, _ = cv2.findContours(skin_ycrcb, cv2.RETR_EXTERNAL, 
        cv2.CHAIN_APPROX_SIMPLE)
for i, c in enumerate(contours):
    area = cv2.contourArea(c)
    if area > 1000:
        cv2.drawContours(im, contours, i, (255, 0, 0), 3)
cv2.imwrite(sys.argv[3], im)         # Final image

Lastly, there are a quite decent amount of papers that do not rely on individual pixel-wise classification for this task. Instead, they start from a base of labeled images that are known to contain either skin pixels or non-skin pixels. From that they train, for example, a SVM and then distinguish other inputs based on this classifier.

like image 179
mmgp Avatar answered Oct 08 '22 05:10

mmgp


A simple and powerful option is histogram backprojection. For example, create a 2D histogram using H and S (from HSV color space) or a* and b* (from La*b* color space), using pixels from different training images of your hand. Then use [cv2.calcBackProject][1] to classify the pixels in your stream. It's very fast and you should get 25 to 30 fps easily, I guess. Note this is a way to learn the color distribution of your object of interest. The same method can be used in other situations.

like image 24
TH. Avatar answered Oct 08 '22 04:10

TH.