I am using ARKit (with SceneKit) to add a virtual object (e.g. a ball). I am tracking a real-world object (e.g. a foot) with the Vision framework and receiving its updated position in the Vision request completion handler.
let request = VNTrackObjectRequest(detectedObjectObservation: lastObservation, completionHandler: self.handleVisionRequestUpdate)
I want to replace the tracked real-world object with a virtual one (for example, replace the foot with a cube), but I am not sure how to convert the boundingBox rect (which we receive in the Vision request completion handler) into a SceneKit node position, since the coordinate systems are different.
Below is the code of the Vision request completion handler:
private func handleVisionRequestUpdate(_ request: VNRequest, error: Error?) {
    // Dispatch to the main queue because we are touching non-atomic, non-thread-safe properties of the view controller
    DispatchQueue.main.async {
        // Make sure we have an actual result
        guard let newObservation = request.results?.first as? VNDetectedObjectObservation else { return }
        // Prepare for the next loop
        self.lastObservation = newObservation
        // Check the confidence level before updating the UI
        guard newObservation.confidence >= 0.3 else {
            return
        }
        // Calculate the view rect
        var transformedRect = newObservation.boundingBox
        // How to convert transformedRect into AR coordinates?
        self.node.position = SCNVector3Make(?.worldTransform.columns.3.x,
                                            ?.worldTransform.columns.3.y,
                                            ?.worldTransform.columns.3.z)
    }
}
Please guide me on how to convert between the two coordinate systems.
Assuming the rectangle is on a horizontal plane, you can perform a hit test against the scene on all 4 corners and use 3 of those corners to calculate the width, height, center, and orientation of the rectangle.
I have a demo app available on GitHub that does exactly that: https://github.com/mludowise/ARKitRectangleDetection
The coordinates for the rectangle corners from VNRectangleObservation will be relative to the size of the image and in different coordinates depending on the phone's rotation. You'll need to multiply them by the view size and invert them based on the phone's rotation:
func convertFromCamera(_ point: CGPoint, view sceneView: ARSCNView) -> CGPoint {
    let orientation = UIApplication.shared.statusBarOrientation
    switch orientation {
    case .portrait, .unknown:
        return CGPoint(x: point.y * sceneView.frame.width, y: point.x * sceneView.frame.height)
    case .landscapeLeft:
        return CGPoint(x: (1 - point.x) * sceneView.frame.width, y: point.y * sceneView.frame.height)
    case .landscapeRight:
        return CGPoint(x: point.x * sceneView.frame.width, y: (1 - point.y) * sceneView.frame.height)
    case .portraitUpsideDown:
        return CGPoint(x: (1 - point.y) * sceneView.frame.width, y: (1 - point.x) * sceneView.frame.height)
    }
}
Then you can perform a hit test on all 4 corners. It's important to use the type .existingPlaneUsingExtent when performing the hit test so that ARKit returns hits for horizontal planes.
let tl = sceneView.hitTest(convertFromCamera(rectangle.topLeft, view: sceneView), types: .existingPlaneUsingExtent)
let tr = sceneView.hitTest(convertFromCamera(rectangle.topRight, view: sceneView), types: .existingPlaneUsingExtent)
let bl = sceneView.hitTest(convertFromCamera(rectangle.bottomLeft, view: sceneView), types: .existingPlaneUsingExtent)
let br = sceneView.hitTest(convertFromCamera(rectangle.bottomRight, view: sceneView), types: .existingPlaneUsingExtent)
Then it gets a little complicated...
Because each hit test could return 0 to n results, you will need to filter out any hits that lie on a different plane. You can do this by comparing the anchors for each ARHitTestResult:
hit1.anchor == hit2.anchor
Also, you only need 3 out of 4 corners to identify the rectangle's dimensions, position, and orientation, so it's okay if one corner doesn't return any hit test results. The demo app linked above shows how I did that.
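As a rough illustration of the filtering step, here is a minimal sketch that runs without ARKit. `CornerHit` and `hitsOnSharedPlane` are hypothetical names, not part of ARKit or the demo app; `CornerHit` stands in for ARHitTestResult and carries only an anchor identifier. In a real app you would compare `ARHitTestResult.anchor` values as shown above. The sketch assumes the plane you want is the one hit by the most corners.

```swift
import Foundation

// Hypothetical stand-in for ARHitTestResult: only the anchor identity matters here.
struct CornerHit {
    let anchorID: UUID?   // nil when the hit is not tied to a plane anchor
}

/// Takes the hit list for each corner and returns, per corner, the first hit
/// that lies on the anchor shared by the most corners. Returns nil when fewer
/// than `minimumCorners` corners hit a common anchor.
func hitsOnSharedPlane(_ cornerHits: [[CornerHit]], minimumCorners: Int = 3) -> [CornerHit?]? {
    // Count how many corners have at least one hit on each anchor.
    var cornerCount: [UUID: Int] = [:]
    for hits in cornerHits {
        let anchors = Set(hits.compactMap { $0.anchorID })
        for id in anchors {
            cornerCount[id, default: 0] += 1
        }
    }

    // Pick the anchor seen by the most corners (ties are resolved arbitrarily).
    guard let best = cornerCount.max(by: { $0.value < $1.value }),
          best.value >= minimumCorners else {
        return nil
    }

    // For each corner keep the first hit on that anchor, or nil if it has none.
    return cornerHits.map { hits in
        hits.first { $0.anchorID == best.key }
    }
}
```

Passing the four hit lists in a fixed order (for example top-left, top-right, bottom-left, bottom-right) lets you tell which corner, if any, came back as nil.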
You can calculate the rectangle's width from the distance between the left and right corners (for either top or bottom). Likewise you can calculate the rectangle's height from the distance between the top & bottom corners (for either left or right).
func distance(_ a: SCNVector3, from b: SCNVector3) -> CGFloat {
    let deltaX = a.x - b.x
    let deltaY = a.y - b.y
    let deltaZ = a.z - b.z
    return CGFloat(sqrt(deltaX * deltaX + deltaY * deltaY + deltaZ * deltaZ))
}
let width = distance(right, from: left)
let height = distance(top, from: bottom)
You can calculate its position by getting the midpoint from the opposite corners of the rectangle (either topLeft & bottomRight or topRight & bottomLeft):
let midX = (c1.x + c2.x) / 2
let midY = (c1.y + c2.y) / 2
let midZ = (c1.z + c2.z) / 2
let center = SCNVector3Make(midX, midY, midZ)
You can also calculate the orientation of the rectangle (rotation along the y-axis) from the left and right corners (for either top or bottom):
let distX = right.x - left.x
let distZ = right.z - left.z
let orientation = -atan(distZ / distX)
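The width, height, center, and orientation calculations above can be combined into one step. The following is a minimal sketch for the case where the bottom-right corner is the one that is missing. `Vec3`, `RectangleEstimate`, and `estimateRectangle` are hypothetical names; `Vec3` stands in for SCNVector3 so the code runs without SceneKit.

```swift
import Foundation

// Hypothetical stand-in for SCNVector3 so the sketch runs without SceneKit.
struct Vec3 {
    var x: Float
    var y: Float
    var z: Float
}

// Hypothetical container for the values computed in the steps above.
struct RectangleEstimate {
    let center: Vec3
    let width: Float
    let height: Float
    let orientation: Float   // rotation about the y-axis, in radians
}

// Euclidean distance between two points in world space.
func distance(_ a: Vec3, _ b: Vec3) -> Float {
    let dx = a.x - b.x
    let dy = a.y - b.y
    let dz = a.z - b.z
    return (dx * dx + dy * dy + dz * dz).squareRoot()
}

// Estimates the rectangle from three corners, assuming the bottom-right
// corner returned no usable hit test result.
func estimateRectangle(topLeft: Vec3, topRight: Vec3, bottomLeft: Vec3) -> RectangleEstimate {
    // Width from the top edge, height from the left edge.
    let width = distance(topRight, topLeft)
    let height = distance(topLeft, bottomLeft)

    // Center is the midpoint of the opposite corners topRight and bottomLeft.
    let center = Vec3(x: (topRight.x + bottomLeft.x) / 2,
                      y: (topRight.y + bottomLeft.y) / 2,
                      z: (topRight.z + bottomLeft.z) / 2)

    // Same formula as above, using the top edge. If both deltas are zero
    // (degenerate edge), the result is NaN and should be rejected by the caller.
    let distX = topRight.x - topLeft.x
    let distZ = topRight.z - topLeft.z
    let orientation = -atan(distZ / distX)

    return RectangleEstimate(center: center, width: width, height: height, orientation: orientation)
}
```

If a different corner is missing, the same idea applies with the other edges: use whichever horizontal and vertical edges have both endpoints, and whichever diagonal has both endpoints for the center.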
Then put that all together and display something in AR overlaid on the rectangle. Here's an example of displaying a virtual rectangle by subclassing SCNNode:
class RectangleNode: SCNNode {
    init(center: SCNVector3, width: CGFloat, height: CGFloat, orientation: Float) {
        super.init()
        // Create the 3D plane geometry with the dimensions calculated from corners
        let planeGeometry = SCNPlane(width: width, height: height)
        let rectNode = SCNNode(geometry: planeGeometry)
        // Planes in SceneKit are vertical by default so we need to rotate
        // 90 degrees to match planes in ARKit
        var transform = SCNMatrix4MakeRotation(-Float.pi / 2.0, 1.0, 0.0, 0.0)
        // Rotate about the y-axis to match the rectangle's orientation
        transform = SCNMatrix4Rotate(transform, orientation, 0, 1, 0)
        rectNode.transform = transform
        // We add the new node to ourself since we inherited from SCNNode
        self.addChildNode(rectNode)
        // Set position to the center of rectangle
        self.position = center
    }

    // Required by the compiler once a subclass declares its own designated initializer
    required init?(coder aDecoder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }
}
The main thing to consider is that the bounding rectangle is in the 2D image, while the ARKit scene is 3D. This means that until you pick a depth, the position of the bounding rectangle in 3D is undefined.
What you should do is run a hit test against the scene to get from 2D coordinates to 3D:
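let box = newObservation.boundingBox
// boundingBox is normalized (0...1) with its origin at the bottom-left, while
// ARSCNView.hitTest expects view coordinates. This scaling assumes the image
// given to Vision fills the view in the same orientation.
let viewSize = sceneView.bounds.size
let rectCenter = CGPoint(x: box.midX * viewSize.width, y: (1 - box.midY) * viewSize.height)
let hitTestResults = sceneView.hitTest(rectCenter, types: [.existingPlaneUsingExtent, .featurePoint])
// Pick the hitTestResult you need (nearest?), get position via worldTransform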