What flow would the program go through?
The technology works by identifying unique details in people's faces, then comparing that facial data to faces stored in databases such as mugshot databases, DMV photos, and even social media. Facial recognition can be used to identify people in stored photos and videos, and in real time.
Very roughly, the processing stages would be:

1. Find the face in the image (face detection)
2. Normalize the face image (size, exposure)
3. Extract features from the normalized face
4. Match the features against those of known faces (classification)
Step 1 is usually done using the classic Viola-Jones face detection algorithm. It's quite fast and reliable.
The faces found in step 1 may differ in brightness, contrast, and size. To simplify the later stages, step 2 scales them all to the same size and compensates exposure differences (e.g. using histogram equalization).
There are many approaches to step 3. Early face recognizers tried to locate specific landmarks (centers of the eyes, tip of the nose, corners of the lips, etc.) and used the geometric distances and angles between them as features for recognition. From what I've read, these approaches were very fast, but not that reliable.
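The geometric idea can be illustrated like this; the landmark names and coordinates are hypothetical, and real systems would first locate them automatically:

```python
# Sketch of the classic geometric approach: use the pairwise distances
# between facial landmarks, normalized for overall face size, as the
# feature vector.
import numpy as np
from itertools import combinations

def geometric_features(landmarks):
    """landmarks: dict of name -> (x, y). Returns normalized distances."""
    pts = np.asarray(list(landmarks.values()), dtype=float)
    dists = np.array([np.linalg.norm(pts[i] - pts[j])
                      for i, j in combinations(range(len(pts)), 2)])
    return dists / dists.max()   # divide out overall face size

features = geometric_features({
    "left_eye": (30, 40), "right_eye": (70, 40),
    "nose_tip": (50, 65), "mouth": (50, 85),
})
print(len(features))  # 6 pairwise distances for 4 landmarks
```

Dividing by the largest distance makes the features invariant to how large the face appears in the image, though not to pose.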
A more recent approach, "Eigenfaces", is based on the fact that face images can be approximated as a linear combination of base images (found through PCA on a large set of training images). The linear coefficients of this approximation can be used as features. The approach can also be applied to parts of the face (eyes, nose, mouth) individually. It works best if the pose is the same across all images; if some faces look to the left and others look upwards, it won't work as well. Active appearance models try to counter that effect by fitting a statistical model of face shape and appearance instead of treating faces as flat 2D pixel patterns.
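A minimal Eigenfaces sketch, with random data standing in for real training faces (a real system would train on thousands of aligned face images):

```python
# Step 3 sketch, Eigenfaces style: PCA via SVD on flattened,
# mean-centered face images. The rows of Vt are the base images
# ("eigenfaces"); projecting a face onto them gives its feature vector.
import numpy as np

def train_eigenfaces(faces, n_components=8):
    """faces: (n_samples, n_pixels) matrix of flattened grayscale faces."""
    mean = faces.mean(axis=0)
    _, _, Vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, Vt[:n_components]       # mean face + top eigenfaces

def extract_features(face, mean, eigenfaces):
    """The linear coefficients of the approximation, used as features."""
    return eigenfaces @ (face - mean)

rng = np.random.default_rng(0)
train = rng.random((20, 32 * 32))        # 20 fake 32x32 "faces"
mean, eig = train_eigenfaces(train)
feats = extract_features(train[0], mean, eig)
print(feats.shape)  # (8,)
```

The dimensionality drops from one number per pixel to a handful of coefficients, which is what makes the matching step cheap.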
Step 4 is relatively straightforward: you have a feature vector for each training face and for the current test face, and you want to find the training face that's "most similar" to the test face. That's what machine learning classifiers do. I think the most common choice is the support vector machine (SVM); alternatives include artificial neural networks and k-nearest neighbors. If the features are good, the choice of ML algorithm won't matter that much.
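The simplest instance of step 4 is 1-nearest-neighbor matching; the feature vectors and identity labels below are toy values, and an SVM or neural network would slot into the same place:

```python
# Step 4 sketch: 1-nearest-neighbor classification. Pick the training
# face whose feature vector is closest (Euclidean distance) to the
# test face's.
import numpy as np

def nearest_face(test_features, train_features, labels):
    """Return the label of the most similar training face."""
    dists = np.linalg.norm(train_features - test_features, axis=1)
    return labels[int(np.argmin(dists))]

train = np.array([[0.0, 0.0], [10.0, 10.0]])   # toy feature vectors
labels = ["alice", "bob"]                      # hypothetical identities
print(nearest_face(np.array([1.0, 0.5]), train, labels))  # alice
```

A production system would also threshold the best distance, so that an unknown face is rejected rather than matched to whoever happens to be closest.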
Principal component analysis, as used in the Eigenfaces approach above, is at the base of many pattern recognition systems, facial recognition among them.