
Finding an interesting frame in a video

Does anyone know of an algorithm that I could use to find an "interesting" representative thumbnail for a video?

I have, say, 30 bitmaps and I would like to choose the most representative one as the video thumbnail.

The obvious first step would be to eliminate all black frames. Then perhaps compute the "distance" between the various frames and choose one that is close to the average.

Any ideas here or published papers that could help out?

Sam Saffron asked Nov 13 '08 07:11




1 Answer

If the video contains structure, i.e. several shots, then the standard techniques for video summarisation involve (a) detecting the shots, then (b) using the first, middle, or nth frame to represent each shot. See [1].
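
As a rough illustration of steps (a) and (b), here is a minimal sketch in plain Python. It is not the method from [1]: it assumes a cut can be declared wherever the greyscale histogram changes sharply between consecutive frames, which is only one simple heuristic. The frame format (a flat list of 8-bit grey levels), the names `grey_histogram`, `detect_shots` and `middle_frames`, and the `threshold` value are all my own assumptions.

```python
from typing import List, Sequence, Tuple

GreyFrame = Sequence[int]  # assumed format: flattened, non-empty 8-bit greyscale bitmap


def grey_histogram(frame: GreyFrame, bins: int = 8) -> List[float]:
    """Normalised grey-level histogram with `bins` cells."""
    hist = [0.0] * bins
    for value in frame:
        hist[value * bins // 256] += 1.0  # map 0..255 to 0..bins-1
    return [h / len(frame) for h in hist]


def detect_shots(frames: Sequence[GreyFrame], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Split frames into shots, returned as (start, end) ranges with end exclusive.

    A cut is declared where the L1 distance between consecutive histograms
    exceeds `threshold` (a hypothetical value that needs tuning on real video).
    """
    if not frames:
        return []
    hists = [grey_histogram(f) for f in frames]
    shots = []
    start = 0
    for i in range(1, len(frames)):
        distance = sum(abs(a - b) for a, b in zip(hists[i - 1], hists[i]))
        if distance > threshold:
            shots.append((start, i))
            start = i
    shots.append((start, len(frames)))
    return shots


def middle_frames(shots: Sequence[Tuple[int, int]]) -> List[int]:
    """Index of the middle frame of each shot."""
    return [(start + end - 1) // 2 for start, end in shots]
```

Real footage has fades and dissolves that a single hard threshold will miss, so treat this as a starting point only.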

However, let us assume you wish to find an interesting frame in a single continuous stream of frames taken from a single camera source, i.e. a shot. This is the "key frame detection" problem that is widely discussed in IR/CV (Information Retrieval, Computer Vision) texts. Some illustrative approaches:

  • In [2] a mean colour histogram is computed over all frames, and the key frame is the one whose histogram is closest to it, i.e. we select the best frame in terms of its colour distribution.
  • In [3] we assume that camera stillness is an indicator of frame importance, as Beds also suggested. We find the still frames using optical flow and use those.
  • In [4] each frame is projected into some high-dimensional content space; we find the frames at the corners of that space and use them to represent the video.
  • In [5] frames are evaluated for importance using their length and novelty in content space.
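
To make the "stillness" idea concrete, here is a minimal sketch in plain Python. Note that it does not compute optical flow as [3] does: it swaps in the mean absolute difference between neighbouring frames as a crude motion proxy. The frame format (a flat list of 8-bit grey levels, all frames the same size) and the function names are my own assumptions.

```python
from typing import List, Sequence

GreyFrame = Sequence[int]  # assumed format: flattened, non-empty 8-bit greyscale bitmap


def mean_abs_diff(a: GreyFrame, b: GreyFrame) -> float:
    """Average per-pixel absolute difference between two same-sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def motion_scores(frames: Sequence[GreyFrame]) -> List[float]:
    """Motion proxy per frame: mean of its differences to the previous and next frame."""
    diffs = [mean_abs_diff(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]
    scores = []
    for i in range(len(frames)):
        neighbours = []
        if i > 0:
            neighbours.append(diffs[i - 1])  # difference to the previous frame
        if i < len(frames) - 1:
            neighbours.append(diffs[i])      # difference to the next frame
        scores.append(sum(neighbours) / len(neighbours) if neighbours else 0.0)
    return scores


def stillest_frame(frames: Sequence[GreyFrame]) -> int:
    """Index of the frame with the lowest motion score (first one on ties)."""
    scores = motion_scores(frames)
    return min(range(len(scores)), key=scores.__getitem__)
```

Frame differencing cannot tell camera motion from a change in lighting, which is one reason a proper optical-flow estimate is preferable when you have one available.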

In general, this is a large field with many approaches. Academic conferences such as the International Conference on Image and Video Retrieval (CIVR) are a good place to look for the latest ideas. I find that [6] presents a useful, detailed summary of video abstraction (key-frame detection and summarisation).

For your "find the best of 30 bitmaps" problem I would use an approach like [2]: represent each frame in some feature space (e.g. as a colour histogram), compute the mean representation over all frames, and pick the frame at the minimum distance from that mean. Choose a distance metric that suits your space; I would try Earth Mover's Distance.
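
Here is a minimal sketch of that recipe in plain Python, including the "drop the black frames first" step from the question. It is an illustration under assumptions, not the implementation from [2]: frames are assumed to be already decoded into non-empty flat lists of 8-bit (R, G, B) tuples, the bin count and black threshold are arbitrary, and the distance is plain L1 rather than Earth Mover's Distance, because EMD on a joint colour histogram needs a ground distance and a solver that would not fit in a short example.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B)
Frame = Sequence[Pixel]       # assumed format: decoded bitmap flattened to a pixel list

BINS = 4  # bins per channel, giving BINS**3 histogram cells (an arbitrary choice)


def colour_histogram(frame: Frame, bins: int = BINS) -> List[float]:
    """Normalised joint RGB histogram with bins**3 cells."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in frame:
        # Map each 0..255 channel value to a bin index 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(ri * bins + gi) * bins + bi] += 1.0
    return [h / len(frame) for h in hist]


def is_black(frame: Frame, threshold: float = 16.0) -> bool:
    """Treat a frame as black when its mean channel value is below `threshold`."""
    mean = sum(r + g + b for r, g, b in frame) / (3.0 * len(frame))
    return mean < threshold


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of absolute differences; a simple stand-in for Earth Mover's Distance."""
    return sum(abs(a - b) for a, b in zip(p, q))


def pick_representative(frames: Sequence[Frame]) -> int:
    """Index of the non-black frame whose histogram is closest to the mean histogram."""
    candidates = [i for i, f in enumerate(frames) if not is_black(f)]
    if not candidates:
        raise ValueError("all frames are black")
    hists = {i: colour_histogram(frames[i]) for i in candidates}
    mean_hist = [
        sum(h[k] for h in hists.values()) / len(candidates)
        for k in range(BINS ** 3)
    ]
    return min(candidates, key=lambda i: l1_distance(hists[i], mean_hist))
```

With 30 frames this is cheap enough to run as is; for larger inputs the same computation is easy to vectorise.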

  1. M.S. Lew. Principles of Visual Information Retrieval. Springer Verlag, 2001.
  2. B. Gunsel, Y. Fu, and A.M. Tekalp. Hierarchical temporal video segmentation and content characterization. Multimedia Storage and Archiving Systems II, SPIE, 3229:46-55, 1997.
  3. W. Wolf. Key frame selection by motion analysis. In IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 1228-1231, 1996.
  4. L. Zhao, W. Qi, S.Z. Li, S.Q. Yang, and H.J. Zhang. Key-frame extraction and shot retrieval using Nearest Feature Line. In IW-MIR, ACM MM, pages 217-220, 2000.
  5. S. Uchihashi. Video Manga: Generating semantically meaningful video summaries. In Proc. ACM Multimedia 99, Orlando, FL, Nov., pages 383-392, 1999.
  6. Y. Li, T. Zhang, and D. Tretter. An overview of video abstraction techniques. Technical report, HP Laboratory, July 2001.

graveca answered Sep 20 '22 14:09