 

How to get better results with the OpenCV face recognition module

I'm trying to use OpenCV's face recognition module to recognize two subjects in a video. I cropped 30 face images of the first subject and 20 face images of the second subject from the video, and I use these as my training set.

I've tested all three approaches (Eigenfaces, Fisherfaces and LBP histograms), but I'm not getting good results with any of them. Sometimes the first subject is classified as the second subject and vice versa, sometimes false detections are classified as one of the two subjects, and sometimes other people in the video are classified as one of the two subjects.

How can I improve performance? Would enlarging the training set help? Are there any other packages for face recognition in C++ that I could consider? I think this should be an easy task, as I'm only trying to recognize two different subjects.

Here is my code (I'm using OpenCV 2.4.7 on Windows 8 with VS2012):

#include "opencv2/objdetect/objdetect.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/contrib/contrib.hpp"


#include <iostream>
#include <stdio.h>
#include <fstream>
#include <sstream>

#define EIGEN 0
#define FISHER 0
#define LBPH 1
using namespace std;
using namespace cv;

/** Function Headers */
void detectAndDisplay( Mat frame , int i,Ptr<FaceRecognizer> model);


static Mat toGrayscale(InputArray _src) {
    Mat src = _src.getMat();
    // only allow one channel
    if(src.channels() != 1) {
        CV_Error(CV_StsBadArg, "Only Matrices with one channel are supported");
    }
    // create and return normalized image
    Mat dst;
    cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC1);
    return dst;
}


static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file) {
        string error_message = "No valid input file was given, please check the given filename.";
        CV_Error(CV_StsBadArg, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line)) {
        stringstream liness(line);
        getline(liness, path, separator);
        getline(liness, classlabel);
        if(!path.empty() && !classlabel.empty()) {
            images.push_back(imread(path, 0));
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}

/** Global variables */
String face_cascade_name = "C:\\OIM\\code\\OIM2 - face detection\\Debug\\haarcascade_frontalface_alt.xml";
//String face_cascade_name = "C:\\OIM\\code\\OIM2 - face detection\\Debug\\NewCascade.xml";
//String face_cascade_name = "C:\\OIM\\code\\OIM2 - face detection\\Debug\\haarcascade_eye_tree_eyeglasses.xml";

String eyes_cascade_name = "C:\\OIM\\code\\OIM2 - face detection\\Debug\\haarcascade_eye_tree_eyeglasses.xml";
CascadeClassifier face_cascade;
CascadeClassifier eyes_cascade;
string window_name = "Capture - Face detection";
RNG rng(12345);

/** @function main */
int main( int argc, const char** argv )
{

     string fn_csv = "C:\\OIM\\faces_org.csv";

     // These vectors hold the images and corresponding labels.
    vector<Mat> images;
    vector<int> labels;
    // Read in the data. This can fail if no valid
    // input filename is given.
    try {
        read_csv(fn_csv, images, labels);
    } catch (cv::Exception& e) {
        cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
        // nothing more we can do
        exit(1);
    }
    // Quit if there are not enough images for this demo.
    if(images.size() <= 1) {
        string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
        CV_Error(CV_StsError, error_message);
    }
    // Get the height from the first image. We'll need this
    // later in code to reshape the images to their original
    // size:
    int height = images[0].rows;

      // The following lines create an Eigenfaces model for
    // face recognition and train it with the images and
    // labels read from the given CSV file.
    // This here is a full PCA, if you just want to keep
    // 10 principal components (read Eigenfaces), then call
    // the factory method like this:
    //
    //      cv::createEigenFaceRecognizer(10);
    //
    // If you want to create a FaceRecognizer with a
    // confidence threshold, call it with:
    //
    //      cv::createEigenFaceRecognizer(10, 123.0);
    //
    //Ptr<FaceRecognizer> model = createEigenFaceRecognizer();
#if EIGEN
    Ptr<FaceRecognizer> model = createEigenFaceRecognizer(10,2000000000);
#elif FISHER
    Ptr<FaceRecognizer> model = createFisherFaceRecognizer(0, 200000000);
#elif LBPH
    Ptr<FaceRecognizer> model = createLBPHFaceRecognizer(1, 8, 8, 8, 200000000);
#endif
    model->train(images, labels);


    Mat frame;

    //-- 1. Load the cascades
    if( !face_cascade.load( face_cascade_name ) ){ printf("--(!)Error loading\n"); return -1; };
    if( !eyes_cascade.load( eyes_cascade_name ) ){ printf("--(!)Error loading\n"); return -1; };

    // Get the frame rate
    bool stop(false);
    int count=1;

    char filename[512];
    for (int i=1;i<=517;i++){
        sprintf(filename,"C:\\OIM\\original_frames2\\image%d.jpg",i);
        Mat frame=imread(filename);

        detectAndDisplay(frame,i,model);
        waitKey(0);
    }
    return 0;
}

/** @function detectAndDisplay */
void detectAndDisplay( Mat frame ,int i, Ptr<FaceRecognizer> model)
{

    std::vector<Rect> faces;
    Mat frame_gray;

    cvtColor( frame, frame_gray, CV_BGR2GRAY );
    equalizeHist( frame_gray, frame_gray );

    //-- Detect faces
    //face_cascade.detectMultiScale( frame_gray, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
    face_cascade.detectMultiScale( frame_gray, faces, 1.1, 1, 0|CV_HAAR_SCALE_IMAGE, Size(10, 10) );



    for( size_t i = 0; i < faces.size(); i++ )
    {
        Rect roi = Rect(faces[i].x,faces[i].y,faces[i].width,faces[i].height);
        Mat face=frame_gray(roi);
        resize(face,face,Size(200,200));
         int predictedLabel = -1;
        double confidence = 0.0;
        model->predict(face, predictedLabel, confidence);

        //imshow("gil",face);
        //waitKey(0);
#if EIGEN
        int M=10000;
#elif FISHER
        int M=500;
#elif LBPH
        int M=300;
#endif
        Point center( faces[i].x + faces[i].width*0.5, faces[i].y + faces[i].height*0.5 );
        if ((predictedLabel==1)&& (confidence<M)) 
            ellipse( frame, center, Size( faces[i].width*0.5, faces[i].height*0.5), 0, 0, 360, Scalar( 0, 0, 255 ), 4, 8, 0 );
        if ((predictedLabel==0)&& (confidence<M)) 
            ellipse( frame, center, Size( faces[i].width*0.5, faces[i].height*0.5), 0, 0, 360, Scalar( 255, 0, 0), 4, 8, 0 );
        if  (confidence>M) 
            ellipse( frame, center, Size( faces[i].width*0.5, faces[i].height*0.5), 0, 0, 360, Scalar( 0, 255, 0), 4, 8, 0 );

        Mat faceROI = frame_gray( faces[i] );
        std::vector<Rect> eyes;

        //-- In each face, detect eyes
        eyes_cascade.detectMultiScale( faceROI, eyes, 1.1, 2, 0 |CV_HAAR_SCALE_IMAGE, Size(30, 30) );

        for( size_t j = 0; j < eyes.size(); j++ )
        {
            Point center( faces[i].x + eyes[j].x + eyes[j].width*0.5, faces[i].y + eyes[j].y + eyes[j].height*0.5 );
            int radius = cvRound( (eyes[j].width + eyes[j].height)*0.25 );
            //circle( frame, center, radius, Scalar( 255, 0, 0 ), 4, 8, 0 );
        }
    }
    //-- Show what you got
    //imshow( window_name, frame );
    char filename[512];
    sprintf(filename,"C:\\OIM\\FaceRecognitionResults\\image%d.jpg",i);
    imwrite(filename,frame);
}

Thanks in advance,

Gil.

Asked Jan 24 '14 by GilLevi
1 Answer

First, as noted in the comments, increase the number of training samples if possible, and include the variations (illumination, slight pose changes, etc.) you expect to see in the video. However, especially for Eigenfaces/Fisherfaces, adding many more images will not necessarily improve performance; unfortunately, the best number of training samples depends on your data.
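For example, one way to enlarge the training set while keeping it consistent with the test-time pipeline in your question is to run the same Haar detector over additional frames of each subject and save the crops at the same 200x200 size you feed to the recognizer. The following is only a sketch using OpenCV 2.4's C++ API; the input/output paths and the frame range are placeholders:

#include "opencv2/objdetect/objdetect.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"

#include <cstdio>
#include <vector>

// Sketch: harvest extra training crops with the same detector and crop size
// used at test time, so training and test faces share one pipeline.
int main()
{
    cv::CascadeClassifier detector;
    if (!detector.load("haarcascade_frontalface_alt.xml")) return -1;

    char inName[512], outName[512];
    int saved = 0;
    for (int i = 1; i <= 100; i++) {                       // placeholder frame range
        sprintf(inName, "C:\\OIM\\training_frames\\image%d.jpg", i);   // placeholder path
        cv::Mat frame = cv::imread(inName);
        if (frame.empty()) continue;

        cv::Mat gray;
        cv::cvtColor(frame, gray, CV_BGR2GRAY);
        cv::equalizeHist(gray, gray);

        std::vector<cv::Rect> faces;
        detector.detectMultiScale(gray, faces, 1.1, 2, 0 | CV_HAAR_SCALE_IMAGE, cv::Size(30, 30));

        for (size_t j = 0; j < faces.size(); j++) {
            cv::Mat face = gray(faces[j]);
            cv::resize(face, face, cv::Size(200, 200));    // same size the recognizer sees at test time
            sprintf(outName, "C:\\OIM\\training_crops\\crop%d.jpg", ++saved);  // placeholder path
            cv::imwrite(outName, face);
        }
    }
    return 0;
}

After saving the crops, add their paths and labels to the CSV file that read_csv already loads and re-train the model.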

The more important point is that the difficulty of the problem depends entirely on your video. If it contains variations in illumination and pose, you cannot expect purely appearance-based methods (e.g. Eigenfaces) or texture descriptors (LBP) to succeed on their own. First, you might want to detect faces. Then:

  • You might want to estimate the head pose and warp the face to a frontal view; look into Active Appearance Models (AAM) and Active Shape Models (ASM).
  • Use histogram equalization to attenuate illumination differences.
  • Fitting an ellipse to the detected face region helps suppress background noise (a combined sketch of these last two steps follows this list).
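
A minimal sketch of the last two points, assuming OpenCV 2.4's C++ API: preprocessFace is a hypothetical helper (not an OpenCV function) that equalizes the histogram of a grayscale face crop and masks everything outside a centered ellipse, so the background contributes less to the model. The ellipse axes (0.42/0.48 of the crop size) are illustrative values.

#include "opencv2/core/core.hpp"
#include "opencv2/imgproc/imgproc.hpp"

// Hypothetical helper: histogram equalization + elliptical mask on a grayscale face crop.
static cv::Mat preprocessFace(const cv::Mat& faceGray)
{
    // Reduce illumination differences between frames.
    cv::Mat equalized;
    cv::equalizeHist(faceGray, equalized);

    // Build an elliptical mask roughly covering the face and blank out the corners.
    cv::Mat mask = cv::Mat::zeros(equalized.size(), CV_8UC1);
    cv::Point center(equalized.cols / 2, equalized.rows / 2);
    cv::Size axes(cvRound(equalized.cols * 0.42), cvRound(equalized.rows * 0.48));
    cv::ellipse(mask, center, axes, 0, 0, 360, cv::Scalar(255), -1);   // -1 = filled ellipse

    cv::Mat result = cv::Mat::zeros(equalized.size(), equalized.type());
    equalized.copyTo(result, mask);                                    // keep only pixels inside the ellipse
    return result;
}

The key is to apply the same function both to every training image before model->train() and to every resized face crop before model->predict(), so the recognizer always sees identically preprocessed inputs.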

Of course, there are many other methods available in the literature; the steps above are implemented in OpenCV and commonly known.

Hope it helps.

Answered Oct 06 '22 by Semih Korkmaz