How can I use the depth data captured by the iPhone's TrueDepth camera to distinguish between a real, three-dimensional human face and a photograph of that face?
The requirement is to use it for authentication.
What I did: created a sample app that gets a continuous stream of AVDepthData describing whatever is in front of the camera.
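A minimal version of that setup (simplified sketch – the class name and queue label are illustrative, error handling omitted) looks roughly like this:

import AVFoundation

final class DepthStreamer: NSObject, AVCaptureDepthDataOutputDelegate {

    private let session = AVCaptureSession()
    private let depthOutput = AVCaptureDepthDataOutput()

    func start() {
        // Front-facing TrueDepth camera as a depth-data source.
        guard let device = AVCaptureDevice.default(.builtInTrueDepthCamera,
                                                   for: .depthData,
                                                   position: .front),
              let input = try? AVCaptureDeviceInput(device: device) else { return }

        session.beginConfiguration()
        if session.canAddInput(input) { session.addInput(input) }
        if session.canAddOutput(depthOutput) { session.addOutput(depthOutput) }
        depthOutput.isFilteringEnabled = true
        depthOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth.queue"))
        session.commitConfiguration()
        session.startRunning()
    }

    // Called once per depth frame.
    func depthDataOutput(_ output: AVCaptureDepthDataOutput,
                         didOutput depthData: AVDepthData,
                         timestamp: CMTime,
                         connection: AVCaptureConnection) {
        // depthData.depthDataMap is the per-pixel depth/disparity CVPixelBuffer.
    }
}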
The TrueDepth sensor lets iPhone X / 11 / 12 / 13 generate a high-quality ZDepth channel in addition to the RGB channels captured through the regular selfie camera. The ZDepth channel lets us visually tell whether it's a real human face or just a photo: in the ZDepth channel a real face appears as a gradient, while a photo has an almost solid color, because every pixel on the photo's plane is equidistant from the camera.
At the moment the AVFoundation API has no Bool-type instance property telling you whether it's a real face or a photo, but AVFoundation's capture subsystem does provide the AVDepthData class – a container for per-pixel distance data (a depth map) captured by the camera device. A depth map describes, at each pixel, the distance to an object in meters.
@available(iOS 11.0, *)
open class AVDepthData: NSObject {

    open var depthDataType: OSType { get }
    open var depthDataMap: CVPixelBuffer { get }
    open var isDepthDataFiltered: Bool { get }
    open var depthDataAccuracy: AVDepthDataAccuracy { get }
}
A pixel buffer is capable of containing the depth data's per-pixel depth or disparity map.
var depthDataMap: CVPixelBuffer { get }
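As a rough sketch of that flat-vs-gradient idea (not production code), you can inspect depthDataMap and measure how much depth relief the face region contains. Here faceRect is assumed to be a face bounding box already converted to depth-map pixel coordinates (for example from Vision face detection), and the 3 cm threshold is just an illustrative value:

import AVFoundation
import CoreGraphics

/// Heuristic liveness check: a printed photo is nearly planar, so depth values
/// inside the face region barely vary, while a real face spans several
/// centimeters of depth. faceRect must already be in depth-map pixel
/// coordinates; the 3 cm threshold is an illustrative assumption.
func looksLikeRealFace(_ depthData: AVDepthData, faceRect: CGRect) -> Bool {
    // Normalize to 32-bit depth in meters regardless of the native format.
    let converted = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
    let buffer = converted.depthDataMap

    CVPixelBufferLockBaseAddress(buffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }

    guard let base = CVPixelBufferGetBaseAddress(buffer) else { return false }
    let width    = CVPixelBufferGetWidth(buffer)
    let height   = CVPixelBufferGetHeight(buffer)
    let rowBytes = CVPixelBufferGetBytesPerRow(buffer)

    // Clamp the face rectangle to the buffer bounds.
    let x0 = max(0, Int(faceRect.minX)), x1 = min(width,  Int(faceRect.maxX))
    let y0 = max(0, Int(faceRect.minY)), y1 = min(height, Int(faceRect.maxY))
    guard x0 < x1, y0 < y1 else { return false }

    var minDepth = Float.greatestFiniteMagnitude
    var maxDepth = -Float.greatestFiniteMagnitude

    for y in y0..<y1 {
        let row = base.advanced(by: y * rowBytes).assumingMemoryBound(to: Float32.self)
        for x in x0..<x1 {
            let d = row[x]
            guard d.isFinite, d > 0 else { continue }   // skip holes / invalid pixels
            minDepth = min(minDepth, d)
            maxDepth = max(maxDepth, d)
        }
    }

    // More than ~3 cm of depth relief suggests a real, non-planar face.
    return maxDepth > minDepth && (maxDepth - minDepth) > 0.03
}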
ARKit's heart beats thanks to AVFoundation and CoreMotion sessions (and, to a certain extent, it also uses Vision). Of course you can use this framework for human face detection, but remember that ARKit is a computationally intensive module due to its "heavy metal" tracking subsystem. For successful real-face (not photo) detection, use ARFaceAnchor, which lets you register the head's motion and orientation at 60 fps, and facial blendshapes, which let you register the user's facial expressions in real time.
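For instance, a minimal ARKit sketch of that approach could watch the blink blend shapes – a live face blinks and moves, a photo can't. The 0.6 threshold and the blink criterion are illustrative assumptions, not anything prescribed by the API:

import ARKit
import UIKit

final class FaceLivenessViewController: UIViewController, ARSessionDelegate {

    private let session = ARSession()
    private var sawBlink = false

    override func viewDidLoad() {
        super.viewDidLoad()
        guard ARFaceTrackingConfiguration.isSupported else { return }   // TrueDepth devices only
        session.delegate = self
        session.run(ARFaceTrackingConfiguration())
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        guard let face = anchors.compactMap({ $0 as? ARFaceAnchor }).first else { return }

        // Blend shapes only move on a live face – a flat photo can't blink.
        let blinkL = face.blendShapes[.eyeBlinkLeft]?.floatValue ?? 0
        let blinkR = face.blendShapes[.eyeBlinkRight]?.floatValue ?? 0
        if blinkL > 0.6 && blinkR > 0.6 { sawBlink = true }

        // face.transform also gives head position/orientation at 60 fps,
        // so small natural head motion can be checked the same way.
    }
}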
Implement Apple Vision and CoreML techniques to recognize and classify a human face contained in a CVPixelBuffer. But remember, you need a ZDepth-to-RGB conversion in order to work with Apple Vision – AI/ML mobile frameworks don't work with depth-map data directly at the moment. When you want to use RGBD data for authentication and there are only one or two users' faces to recognize, the model-training task becomes considerably simpler. All you have to do is create an mlmodel for Vision containing many variations of ZDepth facial images.
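A hedged sketch of that Vision pipeline might look like the following. FaceDepthClassifier is a placeholder for the class Xcode would generate from your own trained mlmodel (as are the "realFace" / "photo" labels), and depending on the model's input description you may first need to normalize the float depth values into an 8-bit grayscale image – that is the ZDepth-to-RGB conversion mentioned above:

import AVFoundation
import CoreImage
import CoreML
import Vision

/// Classifies one depth frame with a Core ML model.
/// FaceDepthClassifier is a placeholder for your own generated model class.
func classifyDepthFrame(_ depthData: AVDepthData,
                        completion: @escaping (String?) -> Void) {
    // 1. ZDepth → image: wrap the depth pixel buffer as a grayscale CIImage.
    let depthImage = CIImage(cvPixelBuffer: depthData.depthDataMap)

    // 2. Load the (placeholder) model and wrap it for Vision.
    guard let coreMLModel = try? FaceDepthClassifier(configuration: MLModelConfiguration()).model,
          let visionModel = try? VNCoreMLModel(for: coreMLModel) else {
        completion(nil)
        return
    }

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        let top = (request.results as? [VNClassificationObservation])?.first
        completion(top?.identifier)   // e.g. "realFace" vs "photo"
    }

    // 3. Run the classification on the converted depth image.
    let handler = VNImageRequestHandler(ciImage: depthImage, options: [:])
    try? handler.perform([request])
}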
You can use Apple's Create ML app to generate lightweight and effective mlmodel files.
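If you prefer to script it, roughly the same thing can be done with the CreateML framework on macOS (the directory paths and class labels below are placeholders):

import CreateML
import Foundation

// Train a small classifier from two folders of grayscale ZDepth face images,
// e.g. "realFace" and "photo". Runs on macOS (for instance in a playground).
let trainingDir = URL(fileURLWithPath: "/path/to/ZDepthFaces")   // contains realFace/ and photo/
let classifier = try MLImageClassifier(trainingData: .labeledDirectories(at: trainingDir))

// Export the lightweight .mlmodel to drop into the Xcode project.
try classifier.write(to: URL(fileURLWithPath: "/path/to/FaceDepthClassifier.mlmodel"))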
You can find sample code for detecting and classifying images using Vision here and here. Also, you can read this post to find out how to convert AVDepthData to a regular RGB pattern.