Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Open CV Face Recognition not accurate

In my app I'm trying to do face recognition on a specific image using Open CV, here first I'm training one image and then after training that image if I run face recognition on that image it successfully recognizes that trained face. However, when I turn to another picture of the same person recognition does not work. It just works on the trained image, so my question is how do I rectify it?

Update: What i want to do is that user should select image of a person from storage and then after training that selected image i want to fetch all images from storage which matches face of my trained image

Here is my activity class:

public class MainActivity extends AppCompatActivity {
    private Mat rgba,gray;
    private CascadeClassifier classifier;
    private MatOfRect faces;
    private ArrayList<Mat> images;
    private ArrayList<String> imagesLabels;
    private Storage local;
    ImageView mimage;
    Button prev,next;
    ArrayList<Integer> imgs;
    private int label[] = new int[1];
    private double predict[] = new double[1];
    Integer pos = 0;
    private String[] uniqueLabels;
    FaceRecognizer recognize;
    private boolean trainfaces() {
        if(images.isEmpty())
            return false;
        List<Mat> imagesMatrix = new ArrayList<>();
        for (int i = 0; i < images.size(); i++)
            imagesMatrix.add(images.get(i));
        Set<String> uniqueLabelsSet = new HashSet<>(imagesLabels); // Get all unique labels
        uniqueLabels = uniqueLabelsSet.toArray(new String[uniqueLabelsSet.size()]); // Convert to String array, so we can read the values from the indices

        int[] classesNumbers = new int[uniqueLabels.length];
        for (int i = 0; i < classesNumbers.length; i++)
            classesNumbers[i] = i + 1; // Create incrementing list for each unique label starting at 1
        int[] classes = new int[imagesLabels.size()];
        for (int i = 0; i < imagesLabels.size(); i++) {
            String label = imagesLabels.get(i);
            for (int j = 0; j < uniqueLabels.length; j++) {
                if (label.equals(uniqueLabels[j])) {
                    classes[i] = classesNumbers[j]; // Insert corresponding number
                    break;
                }
            }
        }
        Mat vectorClasses = new Mat(classes.length, 1, CvType.CV_32SC1); // CV_32S == int
        vectorClasses.put(0, 0, classes); // Copy int array into a vector

        recognize = LBPHFaceRecognizer.create(3,8,8,8,200);
        recognize.train(imagesMatrix, vectorClasses);
        if(SaveImage())
            return true;

        return false;
    }
    public void cropedImages(Mat mat) {
        Rect rect_Crop=null;
        for(Rect face: faces.toArray()) {
            rect_Crop = new Rect(face.x, face.y, face.width, face.height);
        }
        Mat croped = new Mat(mat, rect_Crop);
        images.add(croped);
    }
    public boolean SaveImage() {
        File path = new File(Environment.getExternalStorageDirectory(), "TrainedData");
        path.mkdirs();
        String filename = "lbph_trained_data.xml";
        File file = new File(path, filename);
        recognize.save(file.toString());
        if(file.exists())
            return true;
        return false;
    }

    private BaseLoaderCallback callbackLoader = new BaseLoaderCallback(this) {
        @Override
        public void onManagerConnected(int status) {
            switch(status) {
                case BaseLoaderCallback.SUCCESS:
                    faces = new MatOfRect();

                    //reset
                    images = new ArrayList<Mat>();
                    imagesLabels = new ArrayList<String>();
                    local.putListMat("images", images);
                    local.putListString("imagesLabels", imagesLabels);

                    images = local.getListMat("images");
                    imagesLabels = local.getListString("imagesLabels");

                    break;
                default:
                    super.onManagerConnected(status);
                    break;
            }
        }
    };

    @Override
    protected void onResume() {
        super.onResume();
        if(OpenCVLoader.initDebug()) {
            Log.i("hmm", "System Library Loaded Successfully");
            callbackLoader.onManagerConnected(BaseLoaderCallback.SUCCESS);
        } else {
            Log.i("hmm", "Unable To Load System Library");
            OpenCVLoader.initAsync(OpenCVLoader.OPENCV_VERSION, this, callbackLoader);
        }
    }

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        prev = findViewById(R.id.btprev);
        next = findViewById(R.id.btnext);
        mimage = findViewById(R.id.mimage);
       local = new Storage(this);
       imgs = new ArrayList();
       imgs.add(R.drawable.jonc);
       imgs.add(R.drawable.jonc2);
       imgs.add(R.drawable.randy1);
       imgs.add(R.drawable.randy2);
       imgs.add(R.drawable.imgone);
       imgs.add(R.drawable.imagetwo);
       mimage.setBackgroundResource(imgs.get(pos));
        prev.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {
                if(pos!=0){
                  pos--;
                  mimage.setBackgroundResource(imgs.get(pos));
                }
            }
        });
        next.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {
                if(pos<5){
                    pos++;
                    mimage.setBackgroundResource(imgs.get(pos));
                }
            }
        });
        Button train = (Button)findViewById(R.id.btn_train);
        train.setOnClickListener(new View.OnClickListener() {
            @RequiresApi(api = Build.VERSION_CODES.KITKAT)
            @Override
            public void onClick(View view) {
                rgba = new Mat();
                gray = new Mat();
                Mat mGrayTmp = new Mat();
                Mat mRgbaTmp = new Mat();
                classifier = FileUtils.loadXMLS(MainActivity.this);
                Bitmap icon = BitmapFactory.decodeResource(getResources(),
                        imgs.get(pos));
                Bitmap bmp32 = icon.copy(Bitmap.Config.ARGB_8888, true);
                Utils.bitmapToMat(bmp32, mGrayTmp);
                Utils.bitmapToMat(bmp32, mRgbaTmp);
                Imgproc.cvtColor(mGrayTmp, mGrayTmp, Imgproc.COLOR_BGR2GRAY);
                Imgproc.cvtColor(mRgbaTmp, mRgbaTmp, Imgproc.COLOR_BGRA2RGBA);
                /*Core.transpose(mGrayTmp, mGrayTmp); // Rotate image
                Core.flip(mGrayTmp, mGrayTmp, -1); // Flip along both*/
                gray = mGrayTmp;
                rgba = mRgbaTmp;
                Imgproc.resize(gray, gray, new Size(200,200.0f/ ((float)gray.width()/ (float)gray.height())));
                if(gray.total() == 0)
                    Toast.makeText(getApplicationContext(), "Can't Detect Faces", Toast.LENGTH_SHORT).show();
                classifier.detectMultiScale(gray,faces,1.1,3,0|CASCADE_SCALE_IMAGE, new Size(30,30));
                if(!faces.empty()) {
                    if(faces.toArray().length > 1)
                        Toast.makeText(getApplicationContext(), "Mutliple Faces Are not allowed", Toast.LENGTH_SHORT).show();
                    else {
                        if(gray.total() == 0) {
                            Log.i("hmm", "Empty gray image");
                            return;
                        }
                        cropedImages(gray);
                        imagesLabels.add("Baby");
                        Toast.makeText(getApplicationContext(), "Picture Set As Baby", Toast.LENGTH_LONG).show();
                        if (images != null && imagesLabels != null) {
                            local.putListMat("images", images);
                            local.putListString("imagesLabels", imagesLabels);
                            Log.i("hmm", "Images have been saved");
                            if(trainfaces()) {
                                images.clear();
                                imagesLabels.clear();
                            }
                        }
                    }
                }else {
                   /* Bitmap bmp = null;
                    Mat tmp = new Mat(250, 250, CvType.CV_8U, new Scalar(4));
                    try {
                        //Imgproc.cvtColor(seedsImage, tmp, Imgproc.COLOR_RGB2BGRA);
                        Imgproc.cvtColor(gray, tmp, Imgproc.COLOR_GRAY2RGBA, 4);
                        bmp = Bitmap.createBitmap(tmp.cols(), tmp.rows(), Bitmap.Config.ARGB_8888);
                        Utils.matToBitmap(tmp, bmp);
                    } catch (CvException e) {
                        Log.d("Exception", e.getMessage());
                    }*/
                    /*    mimage.setImageBitmap(bmp);*/
                    Toast.makeText(getApplicationContext(), "Unknown Face", Toast.LENGTH_SHORT).show();
                }
            }
        });
        Button recognize = (Button)findViewById(R.id.btn_recognize);
        recognize.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {
                if(loadData())
                    Log.i("hmm", "Trained data loaded successfully");
                rgba = new Mat();
                gray = new Mat();
                faces = new MatOfRect();
                Mat mGrayTmp = new Mat();
                Mat mRgbaTmp = new Mat();
                classifier = FileUtils.loadXMLS(MainActivity.this);
                Bitmap icon = BitmapFactory.decodeResource(getResources(),
                        imgs.get(pos));
                Bitmap bmp32 = icon.copy(Bitmap.Config.ARGB_8888, true);
                Utils.bitmapToMat(bmp32, mGrayTmp);
                Utils.bitmapToMat(bmp32, mRgbaTmp);
                Imgproc.cvtColor(mGrayTmp, mGrayTmp, Imgproc.COLOR_BGR2GRAY);
                Imgproc.cvtColor(mRgbaTmp, mRgbaTmp, Imgproc.COLOR_BGRA2RGBA);
                /*Core.transpose(mGrayTmp, mGrayTmp); // Rotate image
                Core.flip(mGrayTmp, mGrayTmp, -1); // Flip along both*/
                gray = mGrayTmp;
                rgba = mRgbaTmp;
                Imgproc.resize(gray, gray, new Size(200,200.0f/ ((float)gray.width()/ (float)gray.height())));
                if(gray.total() == 0)
                    Toast.makeText(getApplicationContext(), "Can't Detect Faces", Toast.LENGTH_SHORT).show();
                classifier.detectMultiScale(gray,faces,1.1,3,0|CASCADE_SCALE_IMAGE, new Size(30,30));
                if(!faces.empty()) {
                    if(faces.toArray().length > 1)
                        Toast.makeText(getApplicationContext(), "Mutliple Faces Are not allowed", Toast.LENGTH_SHORT).show();
                    else {
                        if(gray.total() == 0) {
                            Log.i("hmm", "Empty gray image");
                            return;
                        }
                        recognizeImage(gray);
                    }
                }else {
                    Toast.makeText(getApplicationContext(), "Unknown Face", Toast.LENGTH_SHORT).show();
                }
            }
        });


    }
    private void recognizeImage(Mat mat) {
        Rect rect_Crop=null;
        for(Rect face: faces.toArray()) {
            rect_Crop = new Rect(face.x, face.y, face.width, face.height);
        }
        Mat croped = new Mat(mat, rect_Crop);
        recognize.predict(croped, label, predict);
        int indice = (int)predict[0];
        Log.i("hmmcheck:",String.valueOf(label[0])+" : "+String.valueOf(indice));
        if(label[0] != -1 && indice < 125)
            Toast.makeText(getApplicationContext(), "Welcome "+uniqueLabels[label[0]-1]+"", Toast.LENGTH_SHORT).show();
        else
            Toast.makeText(getApplicationContext(), "You're not the right person", Toast.LENGTH_SHORT).show();
    }
    private boolean loadData() {
        String filename = FileUtils.loadTrained();
        if(filename.isEmpty())
            return false;
        else
        {
            recognize.read(filename);
            return true;
        }
    }
}

My File Utils Class:

   public class FileUtils {
        private static String TAG = FileUtils.class.getSimpleName();
        private static boolean loadFile(Context context, String cascadeName) {
            InputStream inp = null;
            OutputStream out = null;
            boolean completed = false;
            try {
                inp = context.getResources().getAssets().open(cascadeName);
                File outFile = new File(context.getCacheDir(), cascadeName);
                out = new FileOutputStream(outFile);

                byte[] buffer = new byte[4096];
                int bytesread;
                while((bytesread = inp.read(buffer)) != -1) {
                    out.write(buffer, 0, bytesread);
                }

                completed = true;
                inp.close();
                out.flush();
                out.close();
            } catch (IOException e) {
                Log.i(TAG, "Unable to load cascade file" + e);
            }
            return completed;
        }
        public static CascadeClassifier loadXMLS(Activity activity) {


            InputStream is = activity.getResources().openRawResource(R.raw.lbpcascade_frontalface);
            File cascadeDir = activity.getDir("cascade", Context.MODE_PRIVATE);
            File mCascadeFile = new File(cascadeDir, "lbpcascade_frontalface_improved.xml");
            FileOutputStream os = null;
            try {
                os = new FileOutputStream(mCascadeFile);
                byte[] buffer = new byte[4096];
                int bytesRead;
                while ((bytesRead = is.read(buffer)) != -1) {
                    os.write(buffer, 0, bytesRead);
                }
                is.close();
                os.close();

            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }


            return new CascadeClassifier(mCascadeFile.getAbsolutePath());
        }
        public static String loadTrained() {
            File file = new File(Environment.getExternalStorageDirectory(), "TrainedData/lbph_trained_data.xml");

            return file.toString();
        }
    }

These are the images i'm trying to compare here face of person is same still in recognition it's not matching! Image 1 Image 2

like image 815
R.Coder Avatar asked Nov 14 '19 08:11

R.Coder


2 Answers

Update

According to the new edit in the question, you need a way to identify new people on the fly whose photos might not have been available during the training phase of the model. These tasks are called few shot learning. This is similar to the requirements of the intelligence/police agencies to find their targets using CCTV camera footage. As usually there are not enough images of a specific target, during training, they use models such as FaceNet. I really suggest reading the paper, however, I explain a few of its highlights here:

  • Generally, the last layer of a classifier is a n*1 vector with n-1 of the elements almost equal to zero, and one close to 1. The element close to 1, determines the prediction of the classifier about the input's label. Typical CNN architecture
  • The authors figured out that if they train a classifier network with a specific loss function on a huge dataset of faces, you can use the semi-final layer output as a representation of any face, irrespective of it being in the training set or not, the authors call this vector Face Embedding.
  • The previous result means that with a very well trained FaceNet model, you can summarise any face into a vector. The very interesting attribute of this approach is that the vectors of a specific person's face in different angles/positions/states have are proximate in the euclidian space (this property is enforced by the loss function that the authors chose).enter image description here
  • In summary, you have a model that gets faces as input and returns vectors. The vectors close to each other are very likely to belong to the same person (For checking that you can use KNN or just simple euclidian distance).

One implementation of FaceNet can be found here. I suggest you try to run it on your computer to get to know what you are actually dealing with. After that, it might be best to do the following:

  1. Transform the FaceNet model mentioned in the repository to its tflite version (this blogpost might help)
  2. For each photo submitted by the user, use Face API to extract the face(s)
  3. Use the minified model in your app to get the face embeddings of the extracted face.
  4. Process all the images in the gallery of the user, getting the vectors for the faces in the photos.
  5. Then compare each vector found in step4 with each vector found in step3 to get the matches.

Original Answer

You came across one of the most prevalent challenges of machine learning: Overfitting. Face detection and recognition is a huge area of research on its own and almost all the reasonably accurate models are using some kind of deep learning. Note that even detecting a face accurately is not as easy as it seems, however, as you are doing it on android, you can use Face API for this task. (Other more advanced techniques such as MTCNN are too slow/difficult to deploy on a handset). It has been shown that just feeding the model with a face photo with a lot of background noise or multiple people inside does not work. So, you really cannot skip this step.

After getting a nice trimmed face of the candidate targets from the background, you need to overcome the challenge of recognising the detected faces. Again, all the competent models to the best of my knowledge, are using some sort of deep learning/convolutional neural networks. Using them on a mobile phone is a challenge, but thanks to Tensorflow Lite you can minify them and run them within your app. A project about face recognition on android phones that I had worked on is here that you can check. Keep in mind that any good model should be trained on numerous instances of labelled data, however there are a plethora of models already trained on large datasets of faces or other image recognition tasks, to tweak them and use their existing knowledge, we can employ transfer learning, for a quick start on object detection and transfer learning that is closely related to your case check this blog post.

Overall, you have to get numerous instances of the faces that you want to detect plus numerous face pics of people that you don't care about, then you need to train a model based on the above-mentioned resources, and then you need to use TensorFlow lite to decrease its size and embed it within your app. For each frame then, you call android Face API and feed (the probably detected face) into the model and identify the person.

Depending on your level of tolerance for delay and the number of training set size and number of targets, you can get various results, however, %90+ accuracy is easily achievable if you have only a few target people.

like image 81
Farzad Vertigo Avatar answered Nov 03 '22 00:11

Farzad Vertigo


If I understand correctly, you're training the classifier with a single image. In that case, this one specific image is everything the classifier will be able to ever recognise. You would need a noticeably bigger training set of pictures showing the same person, something like 5 or 10 different images at the very least.

like image 37
Florian Echtler Avatar answered Nov 03 '22 01:11

Florian Echtler