Transforming ARFrame#capturedImage to view size

Tags: ios, arkit

When using the ARSessionDelegate to process the raw camera image in ARKit...

func session(_ session: ARSession, didUpdate frame: ARFrame) {

    let capturedImage = frame.capturedImage

    debugPrint("Display size", UIScreen.main.bounds.size)
    debugPrint("Camera frame resolution", CVPixelBufferGetWidth(capturedImage), CVPixelBufferGetHeight(capturedImage))

    // ...

}

... as documented, the camera image data doesn't match the screen size, for example, on iPhone X I get:

  • Display size: 375x812pt
  • Camera resolution: 1920x1440px

Now there is the displayTransform(for:viewportSize:) API to transform camera coordinates to view coordinates. When using the API like this:

let ciimage = CIImage(cvImageBuffer: capturedImage)
let transform = frame.displayTransform(for: .portrait, viewportSize: UIScreen.main.bounds.size)
let transformedImage = ciimage.transformed(by: transform)
debugPrint("Transformed size", transformedImage.extent.size)

I get a size of 2340x1920, which seems incorrect: the result should have an aspect ratio of 375:812 (~0.46). What am I missing here, and what's the correct way to use this API to transform the camera image to an image "as displayed by ARSCNView"?

(Example project: ARKitCameraImage)

Asked Nov 11 '19 by Ralf Ebert

People also ask

What is ARFrame?

An ARFrame is a video image captured as part of an AR session, together with position-tracking information.

What is ARSCNView?

The ARSCNView class provides the easiest way to create augmented reality experiences that blend virtual 3D content with a device camera view of the real world. When you run the view's provided ARSession object: The view automatically renders the live video feed from the device camera as the scene background.
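For illustration, a minimal sketch of that setup (assuming a plain world-tracking configuration):

import ARKit

// Minimal ARSCNView setup: the live camera feed renders as the scene background automatically
let sceneView = ARSCNView(frame: UIScreen.main.bounds)
sceneView.session.run(ARWorldTrackingConfiguration())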


2 Answers

This turned out to be quite complicated: displayTransform(for:viewportSize:) expects normalized image coordinates, the coordinates apparently have to be flipped only in portrait mode, and the image needs to be not only transformed but also cropped. The following code does the trick for me; suggestions for improving it would be appreciated.

guard let frame = session.currentFrame else { return }
let imageBuffer = frame.capturedImage

let imageSize = CGSize(width: CVPixelBufferGetWidth(imageBuffer), height: CVPixelBufferGetHeight(imageBuffer))
let viewPort = sceneView.bounds
let viewPortSize = sceneView.bounds.size

let interfaceOrientation: UIInterfaceOrientation
if #available(iOS 13.0, *) {
    interfaceOrientation = self.sceneView.window!.windowScene!.interfaceOrientation
} else {
    interfaceOrientation = UIApplication.shared.statusBarOrientation
}

let image = CIImage(cvImageBuffer: imageBuffer)

// The camera image doesn't match the view rotation and aspect ratio
// Transform the image:

// 1) Convert to "normalized image coordinates"
let normalizeTransform = CGAffineTransform(scaleX: 1.0/imageSize.width, y: 1.0/imageSize.height)

// 2) Flip the Y axis (for some mysterious reason this is only necessary in portrait mode)
let flipTransform = (interfaceOrientation.isPortrait) ? CGAffineTransform(scaleX: -1, y: -1).translatedBy(x: -1, y: -1) : .identity

// 3) Apply the transformation provided by ARFrame
// This transformation converts:
// - from normalized image coordinates ((0,0) in the upper-left corner of the image to (1,1) in the lower-right corner)
// - to view coordinates ("a coordinate space appropriate for rendering the camera image onscreen")
// See also: https://developer.apple.com/documentation/arkit/arframe/2923543-displaytransform

let displayTransform = frame.displayTransform(for: interfaceOrientation, viewportSize: viewPortSize)

// 4) Convert to view size
let toViewPortTransform = CGAffineTransform(scaleX: viewPortSize.width, y: viewPortSize.height)

// Transform the image and crop it to the viewport
let transformedImage = image
    .transformed(by: normalizeTransform
        .concatenating(flipTransform)
        .concatenating(displayTransform)
        .concatenating(toViewPortTransform))
    .cropped(to: viewPort)
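To actually use the result, the cropped CIImage still has to be rendered, for example into a CGImage. A minimal sketch, assuming a CIContext that is created once and reused (the helper name renderToCGImage is just illustrative):

// Created once, e.g. as a property; CIContext is expensive to construct
let context = CIContext()

func renderToCGImage(_ transformedImage: CIImage) -> CGImage? {
    // Render the already transformed and cropped image into a CGImage
    return context.createCGImage(transformedImage, from: transformedImage.extent)
}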
Answered Oct 11 '22 by Ralf Ebert

Thank you so much for your answer! I was working on this for a week.

Here's an alternative way to do it without dealing with the orientation. Instead of using the capturedImage property, you can use a snapshot of the view.

// CIContext is expensive to create; store a single instance, e.g. as a property
let context = CIContext()

func session(_ session: ARSession, didUpdate frame: ARFrame) {
    guard let image = CIImage(image: sceneView.snapshot()) else { return }

    let imageSize = image.extent.size

    // Convert to "normalized image coordinates"
    let resize = CGAffineTransform(scaleX: 1.0 / imageSize.width, y: 1.0 / imageSize.height)

    // Convert to view size
    let viewSize = CGAffineTransform(scaleX: sceneView.bounds.size.width, y: sceneView.bounds.size.height)

    // Transform the image and crop it to the view bounds
    let editedImage = image.transformed(by: resize.concatenating(viewSize)).cropped(to: sceneView.bounds)

    sceneView.scene.background.contents = context.createCGImage(editedImage, from: editedImage.extent)
}
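Note that snapshot() returns the rendered view contents, so the resulting image includes any virtual 3D content, and its resolution is tied to the view size rather than to the full camera resolution. If you need the raw camera pixels, the displayTransform-based approach above is still required.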
Answered Oct 11 '22 by Joe