
What are the standard techniques for removing a segmentation (such as a human or bird) from a video?

Let's say you are taking a video (with the camera in a steady position) and a bird flies through the view of the camera. It should be possible to do image segmentation and automatically remove this bird from the video.

What are these styles of algorithms called and how are they normally accomplished?

amssage asked Oct 07 '10 21:10


2 Answers

There's a technique called Simple Interactive Object Extraction (SIOX) that separates foreground objects from the background in still and video images. The open source GIMP editor has an implementation of it, and there's more information about it here.

From the overview:

SIOX stands for Simple Interactive Object Extraction and is a solution for extracting foreground from still images with very little user interaction. SIOX is fast, noise robust, and can therefore also be used for the segmentation of videos. It avoids many of the drawbacks of graph-based segmentation methods but performs about equally well on different benchmarks. SIOX is open and free (Apache License) and the authors have intentionally not patented any part of the technology. As a result, it has been integrated into several open-source image manipulation programs over the past years. SIOX is the underlying algorithm of the foreground extraction tool in the GNU Image Manipulation Program (GIMP) and is part of the tracer tool in Inkscape. SIOX originates from E-Chalk where an instructor standing in front of an electronic chalkboard is segmented. Variants of SIOX are being used for robotic vision and for improving 3D time-of-flight camera segmentation.

Here's a link to the Java Reference Implementation of SIOX.

Here's a link to the PDF with details about how a variation of the algorithm works.

You should be able to adapt it to use inter-frame interpolation to remove a specific foreground object from each frame of a video by using temporal data from surrounding frames.

LBushkin answered Sep 28 '22 02:09


If the camera is fixed and there isn't too much motion in the scene, then I would suggest a method based on background subtraction.

Step 1: Compute the background for each frame of the video. There are complicated algorithms for doing this, but a very simple and effective one is to compute the median value of every pixel across a 3-second time window (longer if the object in question is moving slowly). Incidentally, this kind of filtering on its own will remove most moving objects from the video if the camera is fixed, hence my earlier question about all objects vs. one object.
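As a rough illustration of the median filtering in Step 1, here is a minimal NumPy sketch. It assumes the frames are already loaded into a single array of shape (num_frames, height, width); the function name `median_background` and the `half_window` parameter are made up for this example. At 30 fps, a 3-second window would be roughly `half_window=45`.

```python
import numpy as np


def median_background(frames: np.ndarray, center: int, half_window: int) -> np.ndarray:
    """Estimate the background at frame `center` as the per-pixel temporal median.

    `frames` is assumed to have shape (num_frames, height, width[, channels]).
    The window is clipped at the start and end of the video.
    """
    start = max(0, center - half_window)
    stop = min(len(frames), center + half_window + 1)
    # Median along the time axis; a pixel covered by the moving object in
    # only a minority of the frames keeps its background value.
    return np.median(frames[start:stop], axis=0).astype(frames.dtype)


# Synthetic demo (hypothetical data): a flat gray scene with one bright
# "bird" pixel that moves one column to the right in every frame.
frames = np.full((9, 4, 6), 100, dtype=np.uint8)
for t in range(9):
    frames[t, 1, t % 6] = 255

background = median_background(frames, center=4, half_window=4)
```

In the demo every pixel is covered by the bright spot in at most two of the nine frames, so the median falls back to the gray background value everywhere.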

Step 2: Mark the regions you want to remove in each frame with a brush tool, and replace them with the background pixels. Don't bother with a fine brush or lasso tool as any non-object pixels you mark will just be replaced with their filtered version. You could probably use the same brush marks for several frames since the boundary is not so important. If the object is the only thing moving in the scene, you could just mark the entire frame and have it replaced with the background.
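The replacement in Step 2 could look something like the sketch below. It assumes the brush marks are available as a boolean mask with the same height and width as the frame, and that a background image for that frame has already been estimated; `remove_marked_region` is a hypothetical helper name, not an OpenCV function.

```python
import numpy as np


def remove_marked_region(frame: np.ndarray, background: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Return a copy of `frame` with the masked pixels taken from `background`.

    `mask` is a boolean array of shape (height, width); True marks pixels
    to replace. The input frame is left unmodified.
    """
    if frame.shape != background.shape or mask.shape != frame.shape[:2]:
        raise ValueError("frame, background and mask must have matching sizes")
    result = frame.copy()
    result[mask] = background[mask]
    return result


# Synthetic demo (hypothetical data): a gray color frame with one white
# "object" pixel, and a deliberately coarse rectangular brush mark around it.
background = np.full((4, 6, 3), 100, dtype=np.uint8)
frame = background.copy()
frame[1, 2] = (255, 255, 255)

mask = np.zeros((4, 6), dtype=bool)
mask[0:3, 1:4] = True

cleaned = remove_marked_region(frame, background, mask)
```

The coarse mask covers more than the object, but the extra pixels are simply swapped for their background values, which is why a precise outline is not needed.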

Anyway, to answer your more general question, the topic you want to research is called inpainting for images and video. There is quite a bit of literature on the subject; what I described is just a super simple method you could implement in an hour or so with OpenCV.

Doug answered Sep 28 '22 01:09