Full video screen capture with video background and painting overlay: performance issues

Long-time Stack Overflow reader, first-time poster.

I'm attempting to create an iPad app called CloudWriter. The concept of the app is drawing the shapes you see in the clouds. Upon launching CloudWriter, the user is presented with a live video background (from the rear camera) with an OpenGL drawing layer on top of it. A user can open the app, point the iPad at clouds in the sky, and draw what they see on the display.

A major feature of the application is recording a video screen capture of what happens on the display during a session. The live video feed and the "drawing" view are merged into a single flat video.

Some assumptions and background information about how this currently works:

  • Using Apple's AVCamCaptureManager, part of the AVCam sample project, as a foundation for much of the camera-related code.
  • Initialize the AVCam capture session using AVCaptureSessionPresetMedium as the preset.
  • Begin piping the camera feed out as a background via the videoPreviewLayer.
  • Overlay that live videoPreviewLayer with a view that allows "drawing" (finger-paint style) using OpenGL. The "drawing" view's background is [UIColor clearColor]. (A minimal setup sketch follows this list.)
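
For reference, here is a minimal, untested sketch of that setup. `paintingView` stands in for the app's actual OpenGL drawing view, and the real project builds its session through AVCamCaptureManager rather than directly like this:

// Minimal sketch of the layer setup described above (assumptions:
// `paintingView` is the OpenGL finger-painting view, `self.view` is the
// root view). The real project goes through AVCamCaptureManager.
AVCaptureSession *session = [[AVCaptureSession alloc] init];
session.sessionPreset = AVCaptureSessionPresetMedium;

AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:nil];
if ([session canAddInput:input]) {
    [session addInput:input];
}

// Live camera feed as the background layer.
AVCaptureVideoPreviewLayer *previewLayer = [AVCaptureVideoPreviewLayer layerWithSession:session];
previewLayer.frame = self.view.bounds;
[self.view.layer addSublayer:previewLayer];

// Transparent OpenGL drawing view on top of the preview layer.
paintingView.backgroundColor = [UIColor clearColor];
[self.view addSubview:paintingView];

[session startRunning];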

At this point, the idea is that a user can point the iPad 3 camera toward some clouds in the sky and draw the shapes they see. This functionality works flawlessly. I begin to run into performance problems when I attempt to make a "flat" video screen capture of the user's session. The resulting "flat" video would have the camera input overlaid in real time with the user's drawings.

A good example of an app that has functionality similar to what we are looking for is Board Cam, available in the App Store.

To initiate the process, there is a "record" button visible in the view at all times. When a user taps the record button, the expectation is that the session will be recorded as a "flat" video screen capture until the record button is tapped again.

When the user taps the "Record" button, the following happens in code:

  • AVCaptureSessionPreset is changed from AVCaptureSessionPresetMedium to AVCaptureSessionPresetPhoto, allowing access to

    - (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
    
  • isRecording is set to YES.
  • didOutputSampleBuffer: begins receiving data and creates an image from the current video buffer data. It does this with a call to

    - (UIImage *) imageFromSampleBuffer:(CMSampleBufferRef) sampleBuffer
    
    • self.currentImage gets set to the resulting image.
  • The application's root view controller overrides drawRect: to create a flattened image, used as an individual frame in the final video.

  • That frame is written to the flat video. (A rough sketch of the record-button toggle appears after this list.)
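
Here is a rough, hypothetical sketch of the record-button toggle those steps describe. The method name recordPressed:, the session ivar, and the cleanup in the else branch are assumptions for illustration, not the project's actual code:

// Hypothetical sketch of the record-button toggle (names such as
// `recordPressed:`, `session`, and `startedAt` are assumptions).
- (IBAction)recordPressed:(id)sender
{
    if (!isRecording) {
        // Switch presets so captureOutput:didOutputSampleBuffer:fromConnection:
        // starts delivering frames for flattening.
        [session beginConfiguration];
        session.sessionPreset = AVCaptureSessionPresetPhoto;
        [session commitConfiguration];

        startedAt = [[NSDate date] retain];   // reference time for frame timestamps
        isRecording = YES;
        [self.view setNeedsDisplay];          // kick off the drawRect: capture loop
    } else {
        isRecording = NO;
        // ...stop capturing and finish writing the movie file here...
    }
}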

To create a flat image used as an individual frame, the root view controller's drawRect: implementation grabs the last frame received by AVCamCaptureManager's didOutputSampleBuffer: code. That implementation is below:

- (void) drawRect:(CGRect)rect {


    NSDate* start = [NSDate date];
    CGContextRef context = [self createBitmapContextOfSize:self.frame.size];

    //not sure why this is necessary...image renders upside-down and mirrored
    CGAffineTransform flipVertical = CGAffineTransformMake(1, 0, 0, -1, 0, self.frame.size.height);
    CGContextConcatCTM(context, flipVertical);

    if( isRecording)
        [[self.layer presentationLayer] renderInContext:context];

    CGImageRef cgImage = CGBitmapContextCreateImage(context);
    UIImage* background = [UIImage imageWithCGImage: cgImage];
    CGImageRelease(cgImage);

    UIImage *bottomImage = background;


    if(((AVCamCaptureManager *)self.captureManager).currentImage != nil && isVideoBGActive )
    {

        UIImage *image = [((AVCamCaptureManager *)self.mainContentScreen.captureManager).currentImage retain];//[UIImage
        CGSize newSize = background.size;
        UIGraphicsBeginImageContext( newSize );
        // Use existing opacity as is
        if( isRecording )
        {
            if( [self.mainContentScreen isVideoBGActive] && _recording)
            {
                [image drawInRect:CGRectMake(0,0,newSize.width,newSize.height)];
            }
            // Apply supplied opacity

            [bottomImage drawInRect:CGRectMake(0,0,newSize.width,newSize.height) blendMode:kCGBlendModeNormal alpha:1.0];

        }
        UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

        UIGraphicsEndImageContext();

        self.currentScreen = newImage;



        [image release];
    }

    if (isRecording) {
        float millisElapsed = [[NSDate date] timeIntervalSinceDate:startedAt] * 1000.0;
        [self writeVideoFrameAtTime:CMTimeMake((int)millisElapsed, 1000)];
    }

    float processingSeconds = [[NSDate date] timeIntervalSinceDate:start];
    float delayRemaining = (1.0 / self.frameRate) - processingSeconds;

    CGContextRelease(context);

    //redraw at the specified framerate
    [self performSelector:@selector(setNeedsDisplay) withObject:nil afterDelay:delayRemaining > 0.0 ? delayRemaining : 0.01];  
}

createBitmapContextOfSize: is below:

- (CGContextRef) createBitmapContextOfSize:(CGSize) size {
    CGContextRef    context = NULL;
    CGColorSpaceRef colorSpace = nil;
    int             bitmapByteCount;
    int             bitmapBytesPerRow;

    bitmapBytesPerRow   = (size.width * 4);
    bitmapByteCount     = (bitmapBytesPerRow * size.height);
    colorSpace = CGColorSpaceCreateDeviceRGB();
    if (bitmapData != NULL) {
        free(bitmapData);
    }
    bitmapData = malloc( bitmapByteCount );
    if (bitmapData == NULL) {
        fprintf (stderr, "Memory not allocated!");
        CGColorSpaceRelease( colorSpace );
        return NULL;
    }

    context = CGBitmapContextCreate (bitmapData,
                                     size.width ,
                                     size.height,
                                     8,      // bits per component
                                     bitmapBytesPerRow,
                                     colorSpace,
                                     kCGImageAlphaPremultipliedFirst);

    if (context == NULL) {
        free (bitmapData);
        fprintf (stderr, "Context not created!");
        CGColorSpaceRelease( colorSpace );
        return NULL;
    }
    // Disable antialiasing only after we know the context is valid.
    CGContextSetAllowsAntialiasing(context, NO);

    //CGAffineTransform transform = CGAffineTransformIdentity;
    //transform = CGAffineTransformScale(transform, size.width * .25, size.height * .25);
    //CGAffineTransformScale(transform, 1024, 768);

    CGColorSpaceRelease( colorSpace );

    return context;
}

captureOutput:didOutputSampleBuffer:fromConnection: is below:

// Delegate routine that is called when a sample buffer was written
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{

    // Create a UIImage from the sample buffer data
    [self imageFromSampleBuffer:sampleBuffer];
}

imageFromSampleBuffer: is below:

// Create a UIImage from sample buffer data - modifed not to return a UIImage *, rather store it in self.currentImage
- (UIImage *) imageFromSampleBuffer:(CMSampleBufferRef) sampleBuffer
{

    // Get a CMSampleBuffer's Core Video image buffer for the media data
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    // Lock the base address of the pixel buffer
    CVPixelBufferLockBaseAddress(imageBuffer, 0);

   // uint8_t *tmp = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
    size_t bytes = CVPixelBufferGetBytesPerRow(imageBuffer); // bytes per row of the pixel buffer
    //void *baseAddress = malloc(bytes);
    size_t height = CVPixelBufferGetHeight(imageBuffer);     
    uint8_t *baseAddress = malloc( bytes * height );
    memcpy( baseAddress, CVPixelBufferGetBaseAddress(imageBuffer), bytes * height );
    size_t width = CVPixelBufferGetWidth(imageBuffer);

    // Create a device-dependent RGB color space
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    // Create a bitmap graphics context with the sample buffer data
    CGContextRef context = CGBitmapContextCreate(baseAddress, width, height, 8,
                                                 bytes, colorSpace, kCGBitmapByteOrderDefault | kCGImageAlphaPremultipliedFirst);


    // CGContextScaleCTM(context, 0.25, 0.25); //scale down to size
    // Create a Quartz image from the pixel data in the bitmap graphics context
    CGImageRef quartzImage = CGBitmapContextCreateImage(context);
    // Unlock the pixel buffer
    CVPixelBufferUnlockBaseAddress(imageBuffer,0);

    // Free up the context and color space
    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);
    free(baseAddress);

    self.currentImage = [UIImage imageWithCGImage:quartzImage scale:0.25 orientation:UIImageOrientationUp];

    // Release the Quartz image
    CGImageRelease(quartzImage);


    return nil; 
}

Finally, I write this out to disk using writeVideoFrameAtTime:, code below:

-(void) writeVideoFrameAtTime:(CMTime)time {
    if (![videoWriterInput isReadyForMoreMediaData]) {
        NSLog(@"Not ready for video data");
    }
    else {
        @synchronized (self) {
            UIImage* newFrame = [self.currentScreen retain];
            CVPixelBufferRef pixelBuffer = NULL;
            CGImageRef cgImage = CGImageCreateCopy([newFrame CGImage]);
            CFDataRef image = CGDataProviderCopyData(CGImageGetDataProvider(cgImage));

            if( image == nil )
            {
                [newFrame release];
                CVPixelBufferRelease( pixelBuffer );
                CGImageRelease(cgImage);
                return;
            }

            int status = CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault, avAdaptor.pixelBufferPool, &pixelBuffer);
            if (status != 0) {
                // Could not get a buffer from the pool; bail out before locking a NULL buffer.
                NSLog(@"Error creating pixel buffer:  status=%d", status);
                [newFrame release];
                CFRelease(image);
                CGImageRelease(cgImage);
                return;
            }

            // set image data into pixel buffer
            CVPixelBufferLockBaseAddress( pixelBuffer, 0 );
            uint8_t* destPixels = CVPixelBufferGetBaseAddress(pixelBuffer);
            CFDataGetBytes(image, CFRangeMake(0, CFDataGetLength(image)), destPixels);  //XXX:  will work if the pixel buffer is contiguous and has the same bytesPerRow as the input data

            if(status == 0){
                BOOL success = [avAdaptor appendPixelBuffer:pixelBuffer withPresentationTime:time];
                if (!success)
                    NSLog(@"Warning:  Unable to write buffer to video");
            }

            //clean up
            [newFrame release];
            CVPixelBufferUnlockBaseAddress( pixelBuffer, 0 );
            CVPixelBufferRelease( pixelBuffer );
            CFRelease(image);
            CGImageRelease(cgImage);
        }

    }

}

As soon as isRecording gets set to YES, performance on the iPad 3 drops from about 20 FPS to maybe 5 FPS. Using Instruments, I can see that the following chunk of code (from drawRect:) is what causes the performance to drop to unusable levels.

 if( _recording )
        {
            if( [self.mainContentScreen isVideoBGActive] && _recording)
            {
                [image drawInRect:CGRectMake(0,0,newSize.width,newSize.height)];
            }
            // Apply supplied opacity

            [bottomImage drawInRect:CGRectMake(0,0,newSize.width,newSize.height) blendMode:kCGBlendModeNormal alpha:1.0];

        }

It's my understanding that, because I'm capturing the full screen, we lose the benefit drawRect: is supposed to give: quicker redraws, because in theory we only update a small portion (the passed-in CGRect) of the display. Since I'm capturing the full screen, I'm not sure that mechanism can provide much benefit.

To improve performance, I'm thinking that if I were to scale down the image that imageFromSampleBuffer: provides, as well as the current context of the drawing view, I would see an increase in frame rate. Unfortunately, CoreGraphics.framework isn't something I've worked with in the past, so I don't know whether I will be able to tweak performance to acceptable levels.
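
As a rough illustration of that idea (untested, and the 0.5 scale factor is an arbitrary assumption), each flattened frame could be rendered into a smaller context, with the CTM scaled so the existing drawing calls don't need to change:

// Hypothetical sketch: build each flattened frame at half resolution.
// The 0.5 scale factor and property usage are assumptions for illustration.
CGFloat scale = 0.5;
CGSize scaledSize = CGSizeMake(self.frame.size.width * scale,
                               self.frame.size.height * scale);

UIGraphicsBeginImageContextWithOptions(scaledSize, YES, 1.0);
CGContextRef ctx = UIGraphicsGetCurrentContext();

// Scale the CTM so the existing full-size drawInRect:/renderInContext:
// calls composite into the smaller backing store.
CGContextScaleCTM(ctx, scale, scale);

[((AVCamCaptureManager *)self.captureManager).currentImage
    drawInRect:CGRectMake(0, 0, self.frame.size.width, self.frame.size.height)];
[[self.layer presentationLayer] renderInContext:ctx];

UIImage *scaledFrame = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();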

Do any Core Graphics gurus have input?

Also, ARC is turned off for some of the code. The analyzer shows one leak, but I believe it to be a false positive.

Coming Soon, CloudWriter™, where the sky's the limit!

Asked Nov 12 '22 by ThatOneGuy


1 Answer

If you want decent recording performance, you're going to need to avoid redrawing things using Core Graphics. Stick to pure OpenGL ES.

You say that you have your finger painting done in OpenGL ES already, so you should be able to render that to a texture. The live video feed can also be directed to a texture. From there, you can do an overlay blend of the two based on the alpha channel in your finger painting texture.

This is pretty easy to do using OpenGL ES 2.0 shaders. In fact, my GPUImage open source framework can handle the video capture and blending portions of this (see the FilterShowcase sample application for an example of images overlaid on video), if you supply the rendered texture from your painting code. You will have to make sure that the painting is using OpenGL ES 2.0, not 1.1, and that it has the same share group as the GPUImage OpenGL ES context, but I show how to do that in the CubeExample application.

I also handle the video recording for you in GPUImage in a high-performance manner by using texture caches when available (on iOS 5.0+).

You should be able to record this blending at a solid 30 FPS for 720p video (iPad 2) or 1080p video (iPad 3) by using something like my framework and staying within OpenGL ES.
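
For reference, here is a rough sketch of what that GPUImage pipeline could look like. The class and method names are from GPUImage as I recall them and should be checked against the current release; `paintTexture`, `paintSize`, and `frameTime` are assumed to come from your own OpenGL ES 2.0 painting code:

// Rough, untested sketch of a GPUImage-based capture/blend/record pipeline.
// `paintTexture`, `paintSize`, and `frameTime` come from the painting code
// and are assumptions here; verify class names against the GPUImage release.
GPUImageVideoCamera *videoCamera =
    [[GPUImageVideoCamera alloc] initWithSessionPreset:AVCaptureSessionPreset1280x720
                                        cameraPosition:AVCaptureDevicePositionBack];
videoCamera.outputImageOrientation = UIInterfaceOrientationLandscapeLeft;

// Wrap the finger-painting texture so it can feed a filter chain.
GPUImageTextureInput *paintInput =
    [[GPUImageTextureInput alloc] initWithTexture:paintTexture size:paintSize];

// Blend the camera feed with the painting, using the painting's alpha.
GPUImageAlphaBlendFilter *blendFilter = [[GPUImageAlphaBlendFilter alloc] init];
[videoCamera addTarget:blendFilter];
[paintInput addTarget:blendFilter];

// On-screen preview plus a movie writer for the recording.
GPUImageView *previewView = [[GPUImageView alloc] initWithFrame:self.view.bounds];
[self.view addSubview:previewView];

NSURL *movieURL = [NSURL fileURLWithPath:
    [NSTemporaryDirectory() stringByAppendingPathComponent:@"recording.m4v"]];
GPUImageMovieWriter *movieWriter =
    [[GPUImageMovieWriter alloc] initWithMovieURL:movieURL size:CGSizeMake(1280.0, 720.0)];

[blendFilter addTarget:previewView];
[blendFilter addTarget:movieWriter];

[videoCamera startCameraCapture];
[movieWriter startRecording];

// Whenever the painting texture is updated, push it downstream:
[paintInput processTextureWithFrameTime:frameTime];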

Answered Nov 15 '22 by Brad Larson