Task: Classify images of human faces as female or male. Training images with labels are available; the test image is obtained from a webcam.
Using: Python 2.7, OpenCV 2.4.4
I am using ORB to extract features from a grayscale image which I hope to use for training a K-Nearest Neighbor classifier. Each training image is of a different person so the number of keypoints and descriptors for each image are obviously different. My problem is that I'm not able to understand the OpenCV docs for KNN and ORB. I've seen other SO questions about ORB, KNN and FLANN but they didn't help much.
What exactly is the nature of the descriptor given by ORB? How is it different from the descriptors obtained by BRIEF, SURF, SIFT, etc.?
It seems that the feature descriptors should be of the same size for each training sample in KNN. How do I make sure that the descriptors are of the same size for each image? More generally, in what format should features be presented to KNN for training with given data and labels? Should the data be an int or float? Can it be char?
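To make the question concrete, here is a minimal snippet showing what I can inspect about ORB's output (the file name is just one of my training images):

import cv2

orb = cv2.ORB(nfeatures=100)
image = cv2.imread('BF09NES_gray.jpg', 0)   # one of the training images
kp, desc = orb.detectAndCompute(image, None)

# ORB gives one 32-byte binary descriptor per keypoint, so desc is an
# (N, 32) uint8 array where N <= nfeatures and N varies per image.
print len(kp), desc.shape, desc.dtype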
The training data can be found here.
I am also using haarcascade_frontalface_alt.xml from the OpenCV samples.
Right now the KNN model is given only 10 images for training, to check that my program runs without errors, which it does not.
Here is my code:
import cv2
import numpy as np
def chooseCascade():
    # TODO: Option for different cascades
    # HAAR classifier for frontal face
    _cascade = cv2.CascadeClassifier('haarcascade_frontalface_alt.xml')
    return _cascade
def cropToObj(cascade, imageFile):
    # Load as 1-channel grayscale image
    image = cv2.imread(imageFile, 0)

    # Crop to the object of interest in the image
    objRegion = cascade.detectMultiScale(image) # TODO: What if multiple objects in image?

    x1 = objRegion[0,0]
    y1 = objRegion[0,1]
    x1PlusWidth = objRegion[0,0] + objRegion[0,2]
    y1PlusHeight = objRegion[0,1] + objRegion[0,3]

    _objImage = image[y1:y1PlusHeight, x1:x1PlusWidth]

    return _objImage
def recognizer(fileNames):
    # ORB constructor
    orb = cv2.ORB(nfeatures=100)

    keyPoints = []
    descriptors = []

    # A cascade for face detection
    haarFaceCascade = chooseCascade()

    # Start processing images
    for imageFile in fileNames:
        # Find faces using the HAAR cascade
        faceImage = cropToObj(haarFaceCascade, imageFile)

        # Extract keypoints and descriptors
        faceKeyPoints, faceDescriptors = orb.detectAndCompute(faceImage, mask=None)

        # print faceDescriptors.shape
        descRow = faceDescriptors.shape[0]
        descCol = faceDescriptors.shape[1]
        flatFaceDescriptors = faceDescriptors.reshape(descRow*descCol).astype(np.float32)

        keyPoints.append(faceKeyPoints)
        descriptors.append(flatFaceDescriptors)

    print descriptors
    # KNN model and training on descriptors
    responses = []
    for name in fileNames:
        if name.startswith('BF'):
            responses.append(0) # Female
        else:
            responses.append(1) # Male

    knn = cv2.KNearest()
    knnTrainSuccess = knn.train(descriptors,
                                responses,
                                isRegression=False) # isRegression=False implies classification
    # Obtain test face image from cam
    capture = cv2.VideoCapture(0)
    closeCamera = -1
    while closeCamera < 0:
        _retval, _camImage = capture.read()

        # Find face in camera image
        testFaceImage = haarFaceCascade.detectMultiScale(_camImage) # TODO: What if multiple faces?

        # Keypoints and descriptors of test face image
        testFaceKP, testFaceDesc = orb.detectAndCompute(testFaceImage, mask=None)
        testDescRow = testFaceDesc.shape[0]
        testDescCol = testFaceDesc.shape[1]
        flatTestFaceDesc = testFaceDesc.reshape(1, testDescRow*testDescCol).astype(np.float32)

        # Args in knn.find_nearest: testData, k
        returnedValue, result, neighborResponse, distance = knn.find_nearest(flatTestFaceDesc, 3)
        print returnedValue, result, neighborResponse, distance

        # Display results
        # TODO: Overlay classification text
        cv2.imshow("testImage", _camImage)

        closeCamera = cv2.waitKey(1)
    cv2.destroyAllWindows()
if __name__ == '__main__':
    fileNames = ['BF09NES_gray.jpg',
                 'BF11NES_gray.jpg',
                 'BF13NES_gray.jpg',
                 'BF14NES_gray.jpg',
                 'BF18NES_gray.jpg',
                 'BM25NES_gray.jpg',
                 'BM26NES_gray.jpg',
                 'BM29NES_gray.jpg',
                 'BM31NES_gray.jpg',
                 'BM34NES_gray.jpg']
    recognizer(fileNames)
Currently I am getting an error at the line with knn.train(), where descriptors is not detected as a numpy array.
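The only direct fix I can see is to force every training row to the same length before calling knn.train(). A rough sketch of that workaround (the fixed length of 100*32 assumes nfeatures=100 and ORB's 32-byte descriptors; the zero-padding is my own arbitrary choice):

import numpy as np

FIXED_LEN = 100 * 32  # nfeatures * bytes per ORB descriptor

def toFixedRow(flatDesc):
    # Zero-pad or truncate so every sample has the same width
    row = np.zeros(FIXED_LEN, dtype=np.float32)
    n = min(flatDesc.size, FIXED_LEN)
    row[:n] = flatDesc[:n]
    return row

trainData = np.vstack([toFixedRow(d) for d in descriptors])
trainLabels = np.float32(responses)
knn.train(trainData, trainLabels, isRegression=False)

But this is exactly the kind of hardcoding I would like to avoid.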
Also, is this approach completely wrong? Am I supposed to use some other way for gender classification? I wasn't satisfied with the fisherface and eigenface example in the opencv facerec demo so please don't direct me to those.
Any other help is much appreciated. Thanks.
--- EDIT ---
I've tried a few things and come up with an answer.
I am still hoping that someone in the SO community can help me by suggesting an idea so that I don't have to hardcode things into my solution. I also suspect that knn.find_nearest() isn't doing what I need it to do.
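To check my understanding of find_nearest(), here is a toy sanity check on made-up 2D points, independent of the face data:

import cv2
import numpy as np

train = np.float32([[0, 0], [0, 1], [10, 10], [10, 11]])
labels = np.float32([0, 0, 1, 1])

knn = cv2.KNearest()
knn.train(train, labels)

# One query row per test sample; k=3 neighbors vote on the class
ret, results, neighbors, dists = knn.find_nearest(np.float32([[9, 9]]), 3)
print results, neighbors, dists   # expect class 1 as the majority vote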
And as expected, the recognizer is not at all accurate and very prone to misclassification due to rotation, lighting, etc. Any suggestions on improving this approach would be really appreciated.
The database I am using for training is: Karolinska Directed Emotional Faces
I have some doubts about the effectiveness/workability of the described approach. Here's another approach that you might want to consider. The contents of the gen folder are at http://www1.datafilehost.com/d/0f263abc. As you will note, when the data size gets bigger (~10k training samples), the size of the model may become unacceptable (~100-200 MB). Then you will need to look into PCA/LDA etc.
import cv2
import numpy as np
import os

def feaCnt():
    # Run the extractor on a dummy image to get the feature vector length
    mat = np.zeros((400,400,3), dtype=np.uint8)
    ret = extr(mat)
    return len(ret)

def extr(img):
    return sobel(img)

def sobel(img):
    # Filter the grayscale image with 8 directional Sobel-style kernels,
    # downsample each response to 15x15 and concatenate into one vector
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    klr = [[-1,0,1],[-2,0,2],[-1,0,1]]
    kbt = [[1,2,1],[0,0,0],[-1,-2,-1]]
    ktb = [[-1,-2,-1],[0,0,0],[1,2,1]]
    krl = [[1,0,-1],[2,0,-2],[1,0,-1]]
    kd1 = [[0,1,2],[-1,0,1],[-2,-1,0]]
    kd2 = [[-2,-1,0],[-1,0,1],[0,1,2]]
    kd3 = [[0,-1,-2],[1,0,-1],[2,1,0]]
    kd4 = [[2,1,0],[1,0,-1],[0,-1,-2]]
    karr = np.asanyarray([
        klr,
        kbt,
        ktb,
        krl,
        kd1,
        kd2,
        kd3,
        kd4])
    gray = cv2.resize(gray, (40,40))
    res = np.float32([cv2.resize(cv2.filter2D(gray, -1, k), (15,15)) for k in karr])
    return res.flatten()
root = 'C:/data/gen'
model = 'c:/data/models/svm/gen.xml'

# Each subfolder name under root is taken as the integer class label
imgs = []
idx = 0
for path, subdirs, files in os.walk(root):
    for name in files:
        p = path[len(root):].split('\\')
        p.remove('')
        lbl = p[0]
        fpath = os.path.join(path, name)
        imgs.append((fpath, int(lbl)))
        idx += 1
samples = np.zeros((len(imgs), feaCnt()), dtype=np.float32)
labels = np.zeros(len(imgs), dtype=np.float32)

i = 0
for f, l in imgs:
    print i
    img = cv2.imread(f)
    samples[i] = extr(img)
    labels[i] = l
    i += 1
svm = cv2.SVM()
svmparams = dict(kernel_type = cv2.SVM_POLY,
                 svm_type = cv2.SVM_C_SVC,
                 degree = 3.43,
                 gamma = 1.5e-4,
                 coef0 = 1e-1)
print 'svm train'
svm.train(samples, labels, params = svmparams)
svm.save(model)
print 'done'
# Sanity check: accuracy on the training set itself
result = np.float32([svm.predict(s) for s in samples])
correct = 0.
total = 0.
for i, j in zip(result, labels):
    total += 1
    if i == j:
        correct += 1
print '%f' % (correct/total)
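Later, to classify a new image (e.g. a cropped webcam face), the saved model can be loaded back. A minimal sketch (the image path is a placeholder):

# Load the trained model and classify a single image
svm2 = cv2.SVM()
svm2.load(model)
img = cv2.imread('test_face.jpg')   # placeholder path
print svm2.predict(extr(img))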