
Python, OpenCV: classify gender using ORB features and KNN

Task: Classify images of human faces as female or male. Labeled training images are available; the test image is obtained from a webcam.

Using: Python 2.7, OpenCV 2.4.4

I am using ORB to extract features from a grayscale image, which I hope to use to train a K-Nearest Neighbor classifier. Each training image is of a different person, so the number of keypoints and descriptors obviously differs from image to image. My problem is that I'm not able to understand the OpenCV docs for KNN and ORB. I've seen other SO questions about ORB, KNN and FLANN, but they didn't help much.

What exactly is the nature of the descriptor given by ORB? How does it differ from the descriptors obtained with BRIEF, SURF, SIFT, etc.?
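For reference, this is a minimal snippet (not part of my program) to inspect the raw ORB output, assuming OpenCV 2.4 and one of the grayscale training images listed below; each keypoint gets a 32-byte binary descriptor stored as uint8, unlike the float descriptors of SIFT (128 values) or SURF (64/128 values):

    import cv2

    # inspect what ORB actually returns in OpenCV 2.4
    orb = cv2.ORB(nfeatures=100)
    img = cv2.imread('BF09NES_gray.jpg', 0)   # any grayscale training image
    kp, desc = orb.detectAndCompute(img, None)
    print desc.dtype, desc.shape              # uint8, (n_keypoints, 32)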

It seems that the feature descriptors should be of the same size for each training sample in KNN. How do I make sure that the descriptors are of the same size for each image? More generally, in what format should features be presented to KNN for training with given data and labels? Should the data be an int or float? Can it be char?
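One way I imagine this could be handled (an untested sketch with a hypothetical toFixedVector helper, assuming a fixed keypoint budget equal to the nfeatures value I pass to ORB) is to pad or truncate each image's descriptor array and flatten it into a float32 row of constant length:

    import numpy as np

    N_KP = 100      # fixed keypoint budget (same as the nfeatures passed to ORB)
    DESC_LEN = 32   # ORB descriptors are 32 bytes each

    def toFixedVector(desc):
        # Hypothetical helper: pad or truncate an (n, 32) uint8 descriptor array
        # to a fixed-length float32 row vector so every sample has the same size.
        fixed = np.zeros((N_KP, DESC_LEN), dtype=np.float32)
        if desc is not None:
            n = min(len(desc), N_KP)
            fixed[:n] = desc[:n]
        return fixed.reshape(-1)    # shape: (N_KP * DESC_LEN,)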

The training data can be found here.

I am also using haarcascade_frontalface_alt.xml from the OpenCV samples.
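Note that detectMultiScale returns a list of (x, y, w, h) rectangles; for the "multiple faces" TODOs in the code below, one simple option (a hypothetical helper, not in my code yet) would be to keep only the largest detection:

    def largestRegion(regions):
        # Hypothetical helper: pick the detection with the largest area
        # when the cascade returns more than one rectangle.
        return max(regions, key=lambda r: r[2] * r[3])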

Right now the KNN model is given just 10 training images, to check that the program runs without errors, which it does not.

Here is my code:

import cv2
import numpy as np

def chooseCascade():
    # TODO: Option for different cascades
    # HAAR Classifier for frontal face
    _cascade = cv2.CascadeClassifier('haarcascade_frontalface_alt.xml')
    return _cascade

def cropToObj(cascade,imageFile):
    # Load as 1-channel grayscale image
    image = cv2.imread(imageFile,0)

    # Crop to the object of interest in the image
    objRegion = cascade.detectMultiScale(image) # TODO: What if multiple objects in image?

    # Unpack the first detected region and crop to it
    x, y, w, h = objRegion[0]

    _objImage = image[y:y+h, x:x+w]

    return _objImage

def recognizer(fileNames):
    # ORB constructor
    orb = cv2.ORB(nfeatures=100)

    keyPoints = []
    descriptors = [] 

    # A cascade for face detection
    haarFaceCascade = chooseCascade()

    # Start processing images
    for imageFile in fileNames:
        # Find faces using the HAAR cascade
        faceImage = cropToObj(haarFaceCascade,imageFile)

        # Extract keypoints and description 
        faceKeyPoints, faceDescriptors = orb.detectAndCompute(faceImage, mask = None)

        #print faceDescriptors.shape
        descRow = faceDescriptors.shape[0]
        descCol = faceDescriptors.shape[1]

        flatFaceDescriptors = faceDescriptors.reshape(descRow*descCol).astype(np.float32)

        keyPoints.append(faceKeyPoints)
        descriptors.append(flatFaceDescriptors)

    print descriptors

    # KNN model and training on descriptors
    responses = []
    for name in fileNames:
        if name.startswith('BF'):
            responses.append(0) # Female
        else:
            responses.append(1) # Male

    knn = cv2.KNearest()
    knnTrainSuccess = knn.train(descriptors,
                                responses,
                                isRegression = False) # isRegression = false, implies classification

    # Obtain test face image from cam
    capture = cv2.VideoCapture(0)
    closeCamera = -1
    while(closeCamera < 0):
        _retval, _camImage = capture.read() # read() grabs and decodes the next frame

        # Find and crop the face in the camera image
        grayCamImage = cv2.cvtColor(_camImage, cv2.COLOR_BGR2GRAY)
        faceRegions = haarFaceCascade.detectMultiScale(grayCamImage) # TODO: What if multiple faces?
        x, y, w, h = faceRegions[0]
        testFaceImage = grayCamImage[y:y+h, x:x+w]

        # Keypoints and descriptors of test face image
        testFaceKP, testFaceDesc = orb.detectAndCompute(testFaceImage, mask = None)
        testDescRow = testFaceDesc.shape[0]
        testDescCol = testFaceDesc.shape[1]
        flatTestFaceDesc = testFaceDesc.reshape(1,testDescRow*testDescCol).astype(np.float32)

        # Args in knn.find_nearest: testData, neighborhood
        returnedValue, result, neighborResponse, distance = knn.find_nearest(flatTestFaceDesc,3) 

        print returnedValue, result, neighborResponse, distance


        # Display results
        # TODO: Overlay classification text
        cv2.imshow("testImage", _camImage)

        closeCamera = cv2.waitKey(1)
    cv2.destroyAllWindows()


if __name__ == '__main__':
    fileNames = ['BF09NES_gray.jpg', 
                 'BF11NES_gray.jpg', 
                 'BF13NES_gray.jpg', 
                 'BF14NES_gray.jpg', 
                 'BF18NES_gray.jpg', 
                 'BM25NES_gray.jpg', 
                 'BM26NES_gray.jpg', 
                 'BM29NES_gray.jpg', 
                 'BM31NES_gray.jpg', 
                 'BM34NES_gray.jpg']

    recognizer(fileNames)

Currently I am getting an error at the knn.train() line because descriptors is not recognised as a numpy array.
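From the docs, it seems cv2.KNearest.train expects a single float32 matrix with one fixed-length row per sample plus a float32 label array, rather than a Python list of arrays. A sketch of what I think the call should look like (assuming all the flattened descriptors have the same length, which is exactly question 2 above; trainData and trainLabels are my own names):

    import numpy as np
    import cv2

    # descriptors and responses are the lists built in recognizer() above
    trainData = np.vstack(descriptors).astype(np.float32)   # shape (n_samples, n_features)
    trainLabels = np.float32(responses)                     # shape (n_samples,)

    knn = cv2.KNearest()
    knn.train(trainData, trainLabels, isRegression = False)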

Also, is this approach completely wrong? Am I supposed to use some other way for gender classification? I wasn't satisfied with the fisherface and eigenface example in the opencv facerec demo so please don't direct me to those.

Any other help is much appreciated. Thanks.

--- EDIT ---

I've tried a few things and come up with an answer.

I am still hoping that someone in the SO community can help me by suggesting an idea so that I don't have to hardcode things into my solution. I also suspect that knn.find_nearest() isn't doing what I need it to do.

As expected, the recognizer is not at all accurate and is very prone to misclassification due to rotation, lighting, etc. Any suggestions on improving this approach would be really appreciated.
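One idea I have for reducing the sensitivity to scale and lighting (an untested sketch with a hypothetical normalizeFace helper) is to normalise each cropped face before extracting features:

    import cv2

    def normalizeFace(faceImage, size=(100, 100)):
        # Resize the cropped grayscale face to a fixed size and equalise its
        # histogram to reduce scale and lighting variation.
        face = cv2.resize(faceImage, size)
        return cv2.equalizeHist(face)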

The database I am using for training is: Karolinska Directed Emotional Faces


1 Answer

I have some doubts about the effectiveness/workability of the described approach. Here is another approach that you might want to consider. The contents of the gen folder are at http://www1.datafilehost.com/d/0f263abc. As you will note, when the data size gets bigger (~10k training samples) the size of the model may become unacceptable (~100-200 MB), and then you will need to look into PCA/LDA etc.

import cv2
import numpy as np
import os

def feaCnt():
    # Run the extractor once on a dummy image to find the feature vector length
    mat = np.zeros((400,400,3),dtype=np.uint8)
    ret = extr(mat)
    return len(ret)

def extr(img):
    return sobel(img)

def sobel(img):
    # Filter the grayscale image with 8 directional Sobel-like kernels and
    # concatenate the down-sampled responses into a single feature vector
    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    klr = [[-1,0,1],[-2,0,2],[-1,0,1]]
    kbt = [[1,2,1],[0,0,0],[-1,-2,-1]]
    ktb = [[-1,-2,-1],[0,0,0],[1,2,1]]
    krl = [[1,0,-1],[2,0,-2],[1,0,-1]]
    kd1 = [[0,1,2],[-1,0,1],[-2,-1,0]]
    kd2 = [[-2,-1,0],[-1,0,1],[0,1,2]]    
    kd3 = [[0,-1,-2],[1,0,-1],[2,1,0]]
    kd4 = [[2,1,0],[1,0,-1],[0,-1,-2]]
    karr = np.asanyarray([
        klr,
        kbt,
        ktb,
        krl,
        kd1,
        kd2,
        kd3,
        kd4
        ])
    gray = cv2.resize(gray,(40,40))
    res = np.float32([cv2.resize(cv2.filter2D(gray, -1,k),(15,15)) for k in karr])
    return res.flatten()
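# Note: with 8 kernels and each filter response resized to 15x15, every image
# becomes an 8*15*15 = 1800-dimensional float32 vector; this is the length
# that feaCnt() measures above.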


root = 'C:/data/gen'

model='c:/data/models/svm/gen.xml'
imgs = []
idx =0
# Each subdirectory of root is named with the integer class label of the
# images it contains (e.g. c:/data/gen/0/..., c:/data/gen/1/...)
for path, subdirs, files in os.walk(root):
  for name in files:
    p = path[len(root):].split('\\')
    p.remove('')
    lbl = p[0]
    fpath = os.path.join(path, name)
    imgs.append((fpath,int(lbl)))
    idx+=1

samples = np.zeros((len(imgs),feaCnt()),dtype = np.float32)
labels = np.zeros(len(imgs),dtype = np.float32)

i = 0
for f,l in imgs:
  print i
  img = cv2.imread(f)
  samples[i]=extr(img)
  labels[i]=l
  i+=1

svm = cv2.SVM()
svmparams = dict( kernel_type = cv2.SVM_POLY, 
                       svm_type = cv2.SVM_C_SVC,
                       degree=3.43,
                       gamma=1.5e-4,
                       coef0=1e-1,
                       )
print 'svm train'
svm.train(samples,labels,params=svmparams)
svm.save(model)
print 'done'

# sanity check: prediction accuracy on the training set itself
result = np.float32( [(svm.predict(s)) for s in samples])
correct=0.
total=0.

for i,j in zip(result,labels):
    total+=1
    if i==j:
      correct+=1
    print '%f'%(correct/total)
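As a rough illustration of the PCA step mentioned above (a numpy-only sketch of mine, not part of the posted code, with an arbitrary k), the feature matrix could be projected onto its leading principal components before training, which shrinks both the samples and the saved model:

    import numpy as np

    def pcaReduce(samples, k=200):
        # Project the (n_samples, n_features) float32 matrix onto its top-k
        # principal components; keep mean and basis to transform test samples.
        mean = samples.mean(axis=0)
        centered = samples - mean
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        basis = vt[:k]                          # (k, n_features)
        return centered.dot(basis.T), mean, basis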