I'm working on an app that uses the video feed from the DJI Mavic 2 and runs it through a machine learning model to identify objects.
I managed to get my app to preview the feed from the drone using this sample DJI project, but I'm having a lot of trouble trying to get the video data into a format that's usable by the Vision framework.
I used this example from Apple as a guide to create my model (which is working!), but it looks like I need to create a VNImageRequestHandler object, which in that example is initialized with a CVPixelBuffer extracted from a CMSampleBuffer, in order to use Vision.
Any idea how to make this conversion? Is there a better way to do this?
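For context, in Apple's example the pixel buffer comes straight out of the camera's capture callback, roughly like this (a sketch of the standard AVCaptureVideoDataOutputSampleBufferDelegate pattern, with requests being my array of Vision requests):

// Inside an AVCaptureVideoDataOutputSampleBufferDelegate (import AVFoundation and Vision):
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    // CMSampleBufferGetImageBuffer pulls the CVPixelBuffer out of the sample buffer
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    do {
        try handler.perform(self.requests)
    } catch {
        print(error)
    }
}

The DJI SDK hands me raw Data instead, so there is no CMSampleBuffer to start from. Here's my current code: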
class DJICameraViewController: UIViewController, DJIVideoFeedListener, DJISDKManagerDelegate, DJICameraDelegate, VideoFrameProcessor {

    // ...

    func videoFeed(_ videoFeed: DJIVideoFeed, didUpdateVideoData rawData: Data) {
        let videoData = rawData as NSData
        let videoBuffer = UnsafeMutablePointer<UInt8>.allocate(capacity: videoData.length)
        videoData.getBytes(videoBuffer, length: videoData.length)
        DJIVideoPreviewer.instance().push(videoBuffer, length: Int32(videoData.length))
    }

    // MARK: VideoFrameProcessor Protocol Implementation

    func videoProcessorEnabled() -> Bool {
        // This is never called
        return true
    }

    func videoProcessFrame(_ frame: UnsafeMutablePointer<VideoFrameYUV>!) {
        // This is never called
        // cv_pixelbuffer_fastupload is a raw pointer, so reinterpret it as a CVPixelBuffer
        guard let fastUpload = frame.pointee.cv_pixelbuffer_fastupload else { return }
        let pixelBuffer = unsafeBitCast(fastUpload, to: CVPixelBuffer.self)
        let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                                        orientation: exifOrientationFromDeviceOrientation(),
                                                        options: [:])
        do {
            try imageRequestHandler.perform(self.requests)
        } catch {
            print(error)
        }
    }
} // End of DJICameraViewController class
EDIT: From what I've gathered from DJI's (spotty) documentation, it looks like the video feed is H.264-compressed. They claim DJIWidget includes helper methods for decompression, but I haven't had any success figuring out how to use them correctly, since there is no documentation surrounding their use.
EDIT 2: Here's the issue I created on GitHub for the DJIWidget framework
EDIT 3: Updated the code snippet with additional methods for VideoFrameProcessor, removing old code from the videoFeed method
EDIT 4: Details about how to extract the pixel buffer successfully and utilize it can be found in this comment from GitHub
The steps:

1. Call DJIVideoPreviewer's push:length: method and feed it the rawData. (If you are using VideoPreviewerSDKAdapter, skip this step.) H.264 parsing and decoding are performed once you do this.
2. Conform to the VideoFrameProcessor protocol and call DJIVideoPreviewer.registFrameProcessor to register the VideoFrameProcessor protocol object (see the sketch after this list).
3. The VideoFrameProcessor protocol's videoProcessFrame: method will output the VideoFrameYUV data.
4. Get the CVPixelBuffer data. The VideoFrameYUV struct has a cv_pixelbuffer_fastupload field; this is actually of type CVPixelBuffer when hardware decoding is turned on. If you are using software decoding, you will need to create a CVPixelBuffer yourself and copy the data from the VideoFrameYUV's luma, chromaB and chromaR fields, as in the Objective-C snippet below.
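For step 2, the registration might look something like this in Swift. This is a minimal sketch, not code from the GitHub comment: setupVideoPreviewer is a hypothetical helper, and it assumes DJIVideoPreviewer is already receiving data via push:length: as in step 1, with DJIWidget's enableHardwareDecode flag turning on hardware decoding:

import DJIWidget

func setupVideoPreviewer() {
    // Hypothetical helper, called once during setup (e.g. from viewDidLoad).
    guard let previewer = DJIVideoPreviewer.instance() else { return }
    previewer.enableHardwareDecode = true  // populates cv_pixelbuffer_fastupload (step 4)
    previewer.registFrameProcessor(self)   // step 2: register this VideoFrameProcessor
    previewer.start()
}

Without the registFrameProcessor call, the videoProcessorEnabled and videoProcessFrame callbacks never fire, which would explain the "This is never called" comments in the snippet above.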
Code (Objective-C, for the software-decoding case):

VideoFrameYUV *yuvFrame; // the VideoFrameProcessor output
CVPixelBufferRef pixelBuffer = NULL;
CVReturn result = CVPixelBufferCreate(kCFAllocatorDefault,
                                      yuvFrame->width,
                                      yuvFrame->height,
                                      kCVPixelFormatType_420YpCbCr8Planar,
                                      NULL,
                                      &pixelBuffer);
if (result != kCVReturnSuccess || pixelBuffer == NULL) {
    return;
}
if (CVPixelBufferLockBaseAddress(pixelBuffer, 0) != kCVReturnSuccess) {
    CVPixelBufferRelease(pixelBuffer);
    return;
}

// Plane 0 is luma (Y); planes 1 and 2 are the subsampled chroma planes (Cb, Cr).
size_t yPlaneWidth  = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0);
size_t yPlaneHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0);
size_t uPlaneWidth  = CVPixelBufferGetWidthOfPlane(pixelBuffer, 1);
size_t uPlaneHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, 1);
size_t vPlaneWidth  = CVPixelBufferGetWidthOfPlane(pixelBuffer, 2);
size_t vPlaneHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, 2);

// Note: these copies assume each plane is tightly packed (bytes-per-row == width);
// otherwise, copy row by row using CVPixelBufferGetBytesPerRowOfPlane.
uint8_t *yDestination = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
memcpy(yDestination, yuvFrame->luma, yPlaneWidth * yPlaneHeight);
uint8_t *uDestination = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1);
memcpy(uDestination, yuvFrame->chromaB, uPlaneWidth * uPlaneHeight);
uint8_t *vDestination = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 2);
memcpy(vDestination, yuvFrame->chromaR, vPlaneWidth * vPlaneHeight);

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
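A Swift version of the same fallback logic might look like this. It's a sketch, not DJI's API: pixelBuffer(from:) is a hypothetical helper, and like the Objective-C snippet it assumes the VideoFrameYUV planes are tightly packed:

import Foundation
import CoreVideo
import DJIWidget

func pixelBuffer(from frame: UnsafeMutablePointer<VideoFrameYUV>) -> CVPixelBuffer? {
    // Hardware decoding: DJIWidget already hands us a CVPixelBuffer.
    if let fastUpload = frame.pointee.cv_pixelbuffer_fastupload {
        return unsafeBitCast(fastUpload, to: CVPixelBuffer.self)
    }
    // Software decoding: build a planar 4:2:0 buffer from luma/chromaB/chromaR.
    var buffer: CVPixelBuffer?
    guard CVPixelBufferCreate(kCFAllocatorDefault,
                              Int(frame.pointee.width),
                              Int(frame.pointee.height),
                              kCVPixelFormatType_420YpCbCr8Planar,
                              nil,
                              &buffer) == kCVReturnSuccess,
          let pixelBuffer = buffer else { return nil }
    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }
    let planes: [(source: UnsafeMutablePointer<UInt8>?, index: Int)] = [
        (frame.pointee.luma, 0),     // Y
        (frame.pointee.chromaB, 1),  // Cb
        (frame.pointee.chromaR, 2)   // Cr
    ]
    for (source, index) in planes {
        guard let source = source,
              let destination = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, index)
        else { return nil }
        // Assumes bytes-per-row == width for each plane, as in the snippet above.
        memcpy(destination,
               source,
               CVPixelBufferGetWidthOfPlane(pixelBuffer, index) *
               CVPixelBufferGetHeightOfPlane(pixelBuffer, index))
    }
    return pixelBuffer
}

videoProcessFrame can then call this helper and hand the result to VNImageRequestHandler exactly as in the class above.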