When using the ARSessionDelegate to process the raw camera image in ARKit...
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    // The delegate already receives the current frame, so there is
    // no need to read session.currentFrame here.
    let capturedImage = frame.capturedImage
    debugPrint("Display size", UIScreen.main.bounds.size)
    debugPrint("Camera frame resolution", CVPixelBufferGetWidth(capturedImage), CVPixelBufferGetHeight(capturedImage))
    // ...
}
... as documented, the camera image data doesn't match the screen size; for example, on iPhone X I get:

"Display size" (375.0, 812.0)
"Camera frame resolution" 1920 1440
Now there is the displayTransform(for:viewportSize:) API to transform camera coordinates to view coordinates. When using the API like this:
let ciImage = CIImage(cvImageBuffer: capturedImage)
let transform = frame.displayTransform(for: .portrait, viewportSize: UIScreen.main.bounds.size)
let transformedImage = ciImage.transformed(by: transform)
debugPrint("Transformed size", transformedImage.extent.size)
I get a size of 2340x1920, which seems incorrect; the result should have an aspect ratio of 375:812 (~0.46). What am I missing here, and what's the correct way to use this API to transform the camera image to an image "as displayed by ARSCNView"?
(Example project: ARKitCameraImage)
For reference, the documentation for ARFrame.capturedImage describes the property as:

"A video image captured as part of a session with position-tracking information."

And the ARSCNView documentation says:

"The ARSCNView class provides the easiest way to create augmented reality experiences that blend virtual 3D content with a device camera view of the real world. When you run the view's provided ARSession object, the view automatically renders the live video feed from the device camera as the scene background."
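For context, here is a minimal sketch of the setup in which the delegate callback above fires. The names ViewController and sceneView are assumptions for illustration; sceneView is taken to be an ARSCNView owned by the view controller:

import ARKit

class ViewController: UIViewController, ARSessionDelegate {
    @IBOutlet var sceneView: ARSCNView!

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        // Receive per-frame session(_:didUpdate:) callbacks from the view's session.
        sceneView.session.delegate = self
        // Running a world-tracking configuration starts the camera feed,
        // which ARSCNView renders automatically as the scene background.
        sceneView.session.run(ARWorldTrackingConfiguration())
    }
}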
This turned out to be quite complicated, because displayTransform(for:viewportSize:) expects normalized image coordinates, the coordinates apparently have to be flipped only in portrait mode, and the image needs to be not only transformed but also cropped. The following code does the trick for me. Suggestions for how to improve this would be appreciated.
guard let frame = session.currentFrame else { return }
let imageBuffer = frame.capturedImage
let imageSize = CGSize(width: CVPixelBufferGetWidth(imageBuffer), height: CVPixelBufferGetHeight(imageBuffer))
let viewPort = sceneView.bounds
let viewPortSize = sceneView.bounds.size

let interfaceOrientation: UIInterfaceOrientation
if #available(iOS 13.0, *) {
    interfaceOrientation = self.sceneView.window!.windowScene!.interfaceOrientation
} else {
    interfaceOrientation = UIApplication.shared.statusBarOrientation
}

let image = CIImage(cvImageBuffer: imageBuffer)

// The camera image doesn't match the view rotation and aspect ratio.
// Transform the image:

// 1) Convert to "normalized image coordinates"
let normalizeTransform = CGAffineTransform(scaleX: 1.0 / imageSize.width, y: 1.0 / imageSize.height)

// 2) Flip both axes: the transform maps (x, y) to (1 - x, 1 - y), i.e. a 180°
//    rotation in normalized coordinates (for some mysterious reason this is
//    only necessary in portrait mode)
let flipTransform = interfaceOrientation.isPortrait
    ? CGAffineTransform(scaleX: -1, y: -1).translatedBy(x: -1, y: -1)
    : .identity

// 3) Apply the transformation provided by ARFrame
//    This transformation converts:
//    - From normalized image coordinates, which range from (0, 0) in the
//      upper-left corner of the image to (1, 1) in the lower-right corner
//    - To view coordinates ("a coordinate space appropriate for rendering
//      the camera image onscreen")
//    See also: https://developer.apple.com/documentation/arkit/arframe/2923543-displaytransform
let displayTransform = frame.displayTransform(for: interfaceOrientation, viewportSize: viewPortSize)

// 4) Convert to view size
let toViewPortTransform = CGAffineTransform(scaleX: viewPortSize.width, y: viewPortSize.height)

// Transform the image and crop it to the viewport
let transformedImage = image
    .transformed(by: normalizeTransform
        .concatenating(flipTransform)
        .concatenating(displayTransform)
        .concatenating(toViewPortTransform))
    .cropped(to: viewPort)
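The resulting transformedImage is still a CIImage; to display or save it, it has to be rendered. Here is a minimal sketch of that step (the names context, cgImage, and uiImage are illustrative), assuming a CIContext that is created once and reused, since creating one per frame is expensive:

// Create once (e.g. as a property), not per frame.
let context = CIContext()

// Render the cropped CIImage into a CGImage and wrap it in a UIImage.
if let cgImage = context.createCGImage(transformedImage, from: transformedImage.extent) {
    let uiImage = UIImage(cgImage: cgImage)
    // Use uiImage here, e.g. assign it to a UIImageView or write it to the photo library.
}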
Thank you so much for your answer! I'd been working on this for a week.
Here's an alternative way to do it without messing with the orientation. Instead of using the capturedImage property, you can use a snapshot of the screen.
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    guard let image = CIImage(image: sceneView.snapshot()) else { return }
    let imageSize = image.extent.size

    // Convert to "normalized image coordinates"
    let resize = CGAffineTransform(scaleX: 1.0 / imageSize.width, y: 1.0 / imageSize.height)

    // Convert to view size
    let viewSize = CGAffineTransform(scaleX: sceneView.bounds.size.width, y: sceneView.bounds.size.height)

    // Transform image
    let editedImage = image.transformed(by: resize.concatenating(viewSize)).cropped(to: sceneView.bounds)

    // `context` is a CIContext created elsewhere (e.g. once, as a property),
    // since creating a CIContext per frame is expensive.
    sceneView.scene.background.contents = context.createCGImage(editedImage, from: editedImage.extent)
}
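One trade-off to be aware of: snapshot() returns the scene as rendered by ARSCNView, at the view's resolution and including any virtual content already drawn, whereas capturedImage is the raw, full-resolution camera frame. If you need the unaugmented camera pixels, the transform-based approach above is still required.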