Please explain to me, in a few words, how the Viola-Jones face detection method works.


How does the Viola-Jones face detection method work?

2 Answers

The Viola-Jones detector is a strong binary classifier built from several weak detectors.

Each weak detector is an extremely simple binary classifier.

During the learning stage, a cascade of weak detectors is trained with AdaBoost to reach the desired hit rate / miss rate (or precision / recall).

To detect objects, the original image is partitioned into several rectangular patches, each of which is submitted to the cascade.

If a rectangular image patch passes through all of the cascade stages, it is classified as “positive”.

The process is repeated at different scales.

[Image: https://i.stack.imgur.com/K1TpT.png]
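The AdaBoost training step can be sketched roughly as follows. This is a minimal illustration rather than the code of any particular library: it assumes each weak detector is a thresholded single feature (a decision stump), uses the weight update from the Viola-Jones paper, and all function names and the `feature_values` layout (one list per feature, one entry per training sample) are made up for the example.

```python
import math

def stump_predict(value, theta, polarity):
    # Weak classifier: 1 ("positive") if polarity * value < polarity * theta, else 0.
    return 1 if polarity * value < polarity * theta else 0

def best_stump(feature_values, labels, weights):
    # Exhaustively search features, thresholds and polarities for the lowest weighted error.
    best = None
    for j, column in enumerate(feature_values):
        for theta in sorted(set(column)):
            for polarity in (1, -1):
                err = sum(w for v, y, w in zip(column, labels, weights)
                          if stump_predict(v, theta, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, j, theta, polarity)
    return best

def adaboost(feature_values, labels, rounds):
    n = len(labels)
    weights = [1.0 / n] * n
    strong = []  # list of (alpha, feature index, theta, polarity)
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]  # normalise to a distribution
        err, j, theta, polarity = best_stump(feature_values, labels, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)   # clamp to avoid division by zero / log(0)
        beta = err / (1.0 - err)
        # Correctly classified samples are down-weighted; mistakes keep their weight.
        weights = [w * (beta if stump_predict(v, theta, polarity) == y else 1.0)
                   for v, y, w in zip(feature_values[j], labels, weights)]
        strong.append((math.log(1.0 / beta), j, theta, polarity))
    return strong

def strong_predict(strong, sample):
    # sample[j] is the value of feature j for the patch being classified.
    score = sum(a * stump_predict(sample[j], t, p) for a, j, t, p in strong)
    return 1 if score >= 0.5 * sum(a for a, _, _, _ in strong) else 0
```

For example, `adaboost([[5, 1, 4, 2], [1, 2, 8, 9]], [1, 1, 0, 0], rounds=1)` selects the second feature, because a single threshold on it separates the two positives from the two negatives.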

Actually, at a low level, the basic component of an object detector is just something that says whether a certain sub-region of the original image contains an instance of the object of interest or not. That is what a binary classifier does.

The basic weak classifier is based on a very simple visual feature (features of this kind are often referred to as “Haar-like features”).

[Image: https://i.stack.imgur.com/ssXnC.png]

Haar-like features are a class of local features computed by subtracting the sum of the pixels in one subregion of the feature from the sum of the pixels in the remaining region of the feature.

[Image: https://i.stack.imgur.com/dIaGt.png]
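As a rough illustration of that definition, here is a two-rectangle (left/right edge) feature computed by brute force on a grayscale image stored as a list of rows. The function names and the choice of the right half as the subtracted subregion are assumptions made for this sketch.

```python
def region_sum(image, top, left, height, width):
    # Sum of all pixels in the rectangle, computed the slow way.
    return sum(image[r][c]
               for r in range(top, top + height)
               for c in range(left, left + width))

def two_rect_feature(image, top, left, height, width):
    # Left half minus right half: large when there is a vertical edge in the window.
    half = width // 2
    left_sum = region_sum(image, top, left, height, half)
    right_sum = region_sum(image, top, left + half, height, half)
    return left_sum - right_sum

# Bright left half, dark right half.
edge = [[9, 9, 1, 1] for _ in range(4)]
flat = [[5, 5, 5, 5] for _ in range(4)]
print(two_rect_feature(edge, 0, 0, 4, 4))  # 64
print(two_rect_feature(flat, 0, 0, 4, 4))  # 0
```

The feature responds strongly to the edge and not at all to the uniform patch. The cost of this naive version grows with the rectangle area, which is what the integral image avoids.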
These features are easy to calculate and, with the use of an integral image, very efficient to compute.

Lienhart introduced an extended set of tilted Haar-like features (see image).

[Image: https://i.stack.imgur.com/r4K9G.png]

These are the standard Haar-like features tilted by 45 degrees. Lienhart did not originally make use of the tilted checkerboard Haar-like feature (x2y2), since the diagonal elements it represents can be represented simply using tilted features; however, it is clear that a tilted version of this feature can also be implemented and used.

These tilted Haar-like features can also be calculated quickly and efficiently using an integral image that has been tilted by 45 degrees. The only implementation issue is that the tilted features must be rounded to integer values so that they are aligned with pixel boundaries. This process is similar to the rounding used when scaling a Haar-like feature for larger or smaller windows. One difference, however, is that for a 45-degree tilted feature, the integer number of pixels used for the height and width of the feature means that the diagonal coordinates of the pixel will always lie on the same diagonal set of pixels.

[Image: https://i.stack.imgur.com/HBTY8.png]

This means that the number of different-sized 45-degree tilted features available is significantly reduced compared to the standard vertically and horizontally aligned features.

So we have something like:

[Image: https://i.stack.imgur.com/nPGmL.png]

As for the formula, the fast computation of Haar-like features using integral images looks like this:

[Image: https://i.stack.imgur.com/V7x9c.png]
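The formula in the figure is not reproduced here, but the usual formulation can be sketched as follows: each entry of the integral image holds the sum of all pixels above and to the left of it, so the sum over any upright rectangle needs only four lookups. The function names and the extra row and column of zeros are choices made for this sketch.

```python
def integral_image(image):
    rows, cols = len(image), len(image[0])
    # ii has one extra row and column of zeros so border rectangles need no special case.
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = image[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii

def rect_sum(ii, top, left, height, width):
    # Four lookups, independent of the rectangle size.
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ii = integral_image(image)
print(rect_sum(ii, 0, 0, 3, 3))  # 45
print(rect_sum(ii, 1, 1, 2, 2))  # 28
```

A two-rectangle Haar-like feature is then just the difference of two `rect_sum` calls, whatever the size of the detection window.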

Finally, here is a C++ implementation which uses ViolaJones.h by Ivan Kusalic.

To see the complete C++ project, go here.

answered Sep 28 '22 09:09

edgarmtze

The Viola-Jones detector is a strong binary classifier built from several weak detectors. Each weak detector is an extremely simple binary classifier.

The detection consists of the following parts:

Haar Filter: extracts features from the image for classification (the features encode ad-hoc domain knowledge)

Integral Image: allows for very fast feature evaluation

Cascade Classifier: consists of multiple stages of filters that decide whether an image region (a sliding window of the image) is a face.

Below is an overview of how a face is detected in an image.

[Image: https://i.stack.imgur.com/80f6l.png]

A detection window shifts around the whole image and extracts features (Haar filters computed via the Integral Image), then sends the extracted features to the Cascade Classifier to decide whether the window contains a face. The sliding window shifts pixel by pixel. Each time the window shifts, the image region within the window goes through the cascade classifier.
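The window enumeration can be sketched as a small generator. The 24-pixel base window and the 1.25 scale factor are assumptions for this sketch (they match values reported in the Viola-Jones paper, but other implementations choose differently), and `sliding_windows` is a hypothetical name.

```python
def sliding_windows(image_height, image_width, base_size=24, scale_factor=1.25, step=1):
    """Yield (top, left, size) for every square window at every scale."""
    size = base_size
    while size <= min(image_height, image_width):
        for top in range(0, image_height - size + 1, step):
            for left in range(0, image_width - size + 1, step):
                yield top, left, size
        # Grow the window; max() guarantees progress even for tiny sizes.
        size = max(size + 1, int(round(size * scale_factor)))

windows = list(sliding_windows(30, 30))
print(len(windows))  # 50: 49 windows of size 24 plus one of size 30
```

Even a 30x30 image yields 50 windows with a one-pixel step, which is why each window has to be cheap to evaluate and why most of them should be rejected early.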

Haar Filter: you can think of the filters as extracting features such as the eyes, the bridge of the nose, and so on.

[Image: https://i.stack.imgur.com/4mwRX.png]

Integral Image: allows for very fast feature evaluation

[Image: https://i.stack.imgur.com/tUNKt.png]

Cascade Classifier:

A cascade classifier consists of multiple stages of filters, as shown in the figure below. Each time the sliding window shifts, the new region within the sliding window goes through the cascade classifier stage by stage. If the input region fails to pass the threshold of a stage, the cascade classifier immediately rejects the region as a non-face. If a region passes all stages successfully, it is classified as a face candidate, which may be refined by further processing.

[Image: https://i.stack.imgur.com/Kn0lE.png]
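The early-rejection logic can be sketched like this. The two toy stages (mean brightness, then top-versus-bottom contrast) and their thresholds are invented purely for illustration; in a real detector each stage would be a boosted set of Haar-like features with a learned threshold.

```python
def run_cascade(window, stages):
    """Return (is_candidate, number_of_stages_evaluated)."""
    for index, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, index  # rejected immediately, later stages never run
    return True, len(stages)

def mean_brightness(window):
    return sum(map(sum, window)) / (len(window) * len(window[0]))

def top_bottom_contrast(window):
    half = len(window) // 2
    return sum(map(sum, window[:half])) - sum(map(sum, window[half:]))

# Hypothetical stages: cheap test first, more specific test second.
stages = [(mean_brightness, 10), (top_bottom_contrast, 20)]

print(run_cascade([[0, 0], [0, 0]], stages))      # (False, 1)
print(run_cascade([[20, 20], [20, 20]], stages))  # (False, 2)
print(run_cascade([[50, 50], [10, 10]], stages))  # (True, 2)
```

Most windows in a typical image are rejected by the first cheap stages, so the average cost per window stays low.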

For more details:

First, I suggest reading the source paper, Rapid Object Detection using a Boosted Cascade of Simple Features, to get an overview of the method.

If anything is still unclear, see Viola-Jones Face Detection, Implementing the Viola-Jones Face Detection Algorithm, or Study of Viola-Jones Real Time Face Detector for more details.

Python code: Python implementation of the face detection algorithm by Paul Viola and Michael J. Jones.

MATLAB code here.

answered Sep 28 '22 08:09

Jayhello
