How can I take a photo (get a CIImage) from a successful VNRectangleObservation object?
I have a video capture session running, and in func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) I do the handling, namely:
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    // pull the pixel buffer out of the sample buffer and hand it to Vision
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    do {
        try handler.perform([request], on: pixelBuffer)
    } catch {
        print(error)
    }
}
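(For reference, handler and request aren't shown above; given that perform(_:on:) signature, they'd be something like a VNSequenceRequestHandler and a rectangle-detection request, set up roughly as follows. The exact configuration is just an assumption for illustration.)

// assumed setup, declared elsewhere in the class
let handler = VNSequenceRequestHandler()
let request = VNDetectRectanglesRequest()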
Should I save the pixel buffer that I pass to the handler somewhere and manipulate that buffer? It's a damn shame that I can't access the image as a property on the observation object :(
Any ideas?
So you're using a Vision request that produces VNRectangleObservations, and you want to pull out the regions of the subject image identified by those observations? Maybe perspective-project them, too, so that they're rectangular in the image plane? (There's a demo of this in the Vision session from WWDC17.)
You can extract and rectify the region with the CIPerspectiveCorrection filter from Core Image. To set that up, you'll need to pass the corner points from the observation, converted from normalized image coordinates to pixel coordinates. That looks something like this:
func extractPerspectiveRect(_ observation: VNRectangleObservation, from buffer: CVImageBuffer) -> CIImage {
    // get the pixel buffer into Core Image
    let ciImage = CIImage(cvImageBuffer: buffer)

    // convert corners from normalized image coordinates to pixel coordinates
    let topLeft = observation.topLeft.scaled(to: ciImage.extent.size)
    let topRight = observation.topRight.scaled(to: ciImage.extent.size)
    let bottomLeft = observation.bottomLeft.scaled(to: ciImage.extent.size)
    let bottomRight = observation.bottomRight.scaled(to: ciImage.extent.size)

    // pass those to the filter to extract/rectify the image
    return ciImage.applyingFilter("CIPerspectiveCorrection", parameters: [
        "inputTopLeft": CIVector(cgPoint: topLeft),
        "inputTopRight": CIVector(cgPoint: topRight),
        "inputBottomLeft": CIVector(cgPoint: bottomLeft),
        "inputBottomRight": CIVector(cgPoint: bottomRight),
    ])
}
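For example, you might call that from your captureOutput method right after the handler finishes (a sketch, assuming a VNDetectRectanglesRequest whose results you read synchronously after handler.perform(...) returns):

if let observation = request.results?.first as? VNRectangleObservation {
    let correctedImage = extractPerspectiveRect(observation, from: pixelBuffer)
    // render or display correctedImage (see below)
}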
Aside: The scaled function above is a convenience extension on CGPoint to make the coordinate math a bit smaller at the call site:

extension CGPoint {
    func scaled(to size: CGSize) -> CGPoint {
        return CGPoint(x: self.x * size.width,
                       y: self.y * size.height)
    }
}
Now, that gets you a CIImage object. Those aren't really displayable images themselves, just instructions for how to process and display an image, something that can be done in many different possible ways. Many ways to display an image involve CIContext: you can have it render out into another pixel buffer, or maybe into a Metal texture if you're trying to do this processing in real time. On the other hand, if you're just displaying static images less frequently, you can create a UIImage directly from the CIImage and display it in a UIImageView, and UIKit will manage the underlying CIContext and rendering process.
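A minimal sketch of both routes, assuming a UIImageView named imageView (the name is just for illustration):

// Simple route: wrap the CIImage and let UIKit handle rendering.
imageView.image = UIImage(ciImage: correctedImage)

// Explicit route: render through a CIContext yourself. Contexts are
// expensive to create, so keep one around and reuse it for real-time work.
let context = CIContext()
if let cgImage = context.createCGImage(correctedImage, from: correctedImage.extent) {
    imageView.image = UIImage(cgImage: cgImage)
}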