Transforming ARFrame#capturedImage to view size

Tags: ios, arkit

When using the ARSessionDelegate to process the raw camera image in ARKit...

func session(_ session: ARSession, didUpdate frame: ARFrame) {

    let capturedImage = frame.capturedImage

    debugPrint("Display size", UIScreen.main.bounds.size)
    debugPrint("Camera frame resolution", CVPixelBufferGetWidth(capturedImage), CVPixelBufferGetHeight(capturedImage))

    // ...

}

... as documented, the camera image data doesn't match the screen size, for example, on iPhone X I get:

  • Display size: 375x812pt
  • Camera resolution: 1920x1440px

Now there is the displayTransform(for:viewportSize:) API to transform camera coordinates to view coordinates. When using the API like this:

let ciimage = CIImage(cvImageBuffer: capturedImage)
let transform = frame.displayTransform(for: .portrait, viewportSize: UIScreen.main.bounds.size)
let transformedImage = ciimage.transformed(by: transform)
debugPrint("Transformed size", transformedImage.extent.size)

I get a size of 2340x1920, which seems incorrect: the result should have an aspect ratio of 375:812 (~0.46). What am I missing here, and what's the correct way to use this API to transform the camera image to an image "as displayed by ARSCNView"?

(Example project: ARKitCameraImage)

Asked Nov 11 '19 by Ralf Ebert

People also ask

What is ARFrame?

An ARFrame is a video image captured as part of an AR session, together with position-tracking information.

What is ARSCNView?

The ARSCNView class provides the easiest way to create augmented reality experiences that blend virtual 3D content with a device camera view of the real world. When you run the view's provided ARSession object: The view automatically renders the live video feed from the device camera as the scene background.
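For illustration, a minimal sketch of that setup (assuming a plain world-tracking configuration):

import ARKit

// Minimal ARSCNView setup: the live camera feed renders as the scene background automatically
let sceneView = ARSCNView(frame: UIScreen.main.bounds)
sceneView.session.run(ARWorldTrackingConfiguration())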


2 Answers

This turned out to be quite complicated: displayTransform(for:viewportSize:) expects normalized image coordinates, the coordinates apparently have to be flipped only in portrait mode, and the image needs to be not only transformed but also cropped. The following code does the trick for me; suggestions for improving it would be appreciated.

guard let frame = session.currentFrame else { return }
let imageBuffer = frame.capturedImage

let imageSize = CGSize(width: CVPixelBufferGetWidth(imageBuffer), height: CVPixelBufferGetHeight(imageBuffer))
let viewPort = sceneView.bounds
let viewPortSize = sceneView.bounds.size

let interfaceOrientation: UIInterfaceOrientation
if #available(iOS 13.0, *) {
    interfaceOrientation = self.sceneView.window!.windowScene!.interfaceOrientation
} else {
    interfaceOrientation = UIApplication.shared.statusBarOrientation
}

let image = CIImage(cvImageBuffer: imageBuffer)

// The camera image doesn't match the view rotation and aspect ratio
// Transform the image:

// 1) Convert to "normalized image coordinates"
let normalizeTransform = CGAffineTransform(scaleX: 1.0/imageSize.width, y: 1.0/imageSize.height)

// 2) Flip the Y axis (for some mysterious reason this is only necessary in portrait mode)
let flipTransform = (interfaceOrientation.isPortrait) ? CGAffineTransform(scaleX: -1, y: -1).translatedBy(x: -1, y: -1) : .identity

// 3) Apply the transformation provided by ARFrame
// This transformation converts:
// - from normalized image coordinates ((0,0) in the upper-left corner of the image to (1,1) in the lower-right corner)
// - to view coordinates ("a coordinate space appropriate for rendering the camera image onscreen")
// See also: https://developer.apple.com/documentation/arkit/arframe/2923543-displaytransform

let displayTransform = frame.displayTransform(for: interfaceOrientation, viewportSize: viewPortSize)

// 4) Convert to view size
let toViewPortTransform = CGAffineTransform(scaleX: viewPortSize.width, y: viewPortSize.height)

// Transform the image and crop it to the viewport
let transformedImage = image
    .transformed(by: normalizeTransform
        .concatenating(flipTransform)
        .concatenating(displayTransform)
        .concatenating(toViewPortTransform))
    .cropped(to: viewPort)
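To actually use the result, the cropped CIImage still has to be rendered, for example into a CGImage. A minimal sketch, assuming a CIContext that is created once and reused (the helper name renderToCGImage is just illustrative):

// Created once, e.g. as a property; CIContext is expensive to construct
let context = CIContext()

func renderToCGImage(_ transformedImage: CIImage) -> CGImage? {
    // Render the already transformed and cropped image into a CGImage
    return context.createCGImage(transformedImage, from: transformedImage.extent)
}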
Answered Oct 11 '22 by Ralf Ebert

Thank you so much for your answer! I was working on this for a week.

Here's an alternative way to do it without dealing with the orientation. Instead of using the capturedImage property, you can use a snapshot of the view.

// CIContext is expensive to create; store a single instance, e.g. as a property
let context = CIContext()

func session(_ session: ARSession, didUpdate frame: ARFrame) {
    guard let image = CIImage(image: sceneView.snapshot()) else { return }

    let imageSize = image.extent.size

    // Convert to "normalized image coordinates"
    let resize = CGAffineTransform(scaleX: 1.0 / imageSize.width, y: 1.0 / imageSize.height)

    // Convert to view size
    let viewSize = CGAffineTransform(scaleX: sceneView.bounds.size.width, y: sceneView.bounds.size.height)

    // Transform the image and crop it to the view bounds
    let editedImage = image.transformed(by: resize.concatenating(viewSize)).cropped(to: sceneView.bounds)

    sceneView.scene.background.contents = context.createCGImage(editedImage, from: editedImage.extent)
}
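Note that snapshot() returns the rendered view contents, so the resulting image includes any virtual 3D content, and its resolution is tied to the view size rather than to the full camera resolution. If you need the raw camera pixels, the displayTransform-based approach above is still required.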
Answered Oct 11 '22 by Joe