
Recommendations for real-time pixel-level analysis of television (TV) video

[Note: This is a rewrite of an earlier question that was considered inappropriate and closed.]

I need to do some pixel-level analysis of television (TV) video. The exact nature of this analysis is not pertinent, but it basically involves looking at every pixel of every frame of TV video, starting from an MPEG-2 transport stream. The host platform will be server-class, multiprocessor 64-bit Linux machines.

I need a library that can handle the decoding of the transport stream and present me with the image data in real time. OpenCV and ffmpeg are two libraries that I am considering for this work. OpenCV is appealing because I have heard it has easy-to-use APIs and rich image analysis support, but I have no experience using it. I have used ffmpeg in the past for extracting video frame data from files for analysis, but it lacks image analysis support (though Intel's IPP can supplement it).

In addition to general recommendations for approaches to this problem (excluding the actual image analysis), I have some more specific questions that would help me get started:

  1. Are ffmpeg or OpenCV commonly used in industry as a foundation for real-time video analysis, or is there something else I should be looking at?
  2. Can OpenCV decode video frames in real time, and still leave enough CPU left over to do nontrivial image analysis, also in real time?
  3. Is it sufficient to use ffmpeg for MPEG-2 transport stream decoding, or is it preferable to use an MPEG-2 decoding library directly (and if so, which one)?
  4. Are there particular pixel formats for the output frames that ffmpeg or OpenCV is particularly efficient at producing (e.g., RGB, YUV, or YUV422)?
Randall Cook asked Dec 05 '11


2 Answers

1.
I would definitely recommend OpenCV for "real-time" image analysis. I assume that by real-time you mean the ability to keep up with TV frame rates (e.g., NTSC (29.97 fps) or PAL (25 fps)). Of course, as mentioned in the comments, it also depends on the hardware you have available, as well as the image size: SD (480p) vs. HD (720p or 1080p). FFmpeg certainly has its quirks, but you would be hard pressed to find a better free alternative. Its power and flexibility are quite impressive; I'm sure that is one of the reasons the OpenCV developers decided to use it as the back-end for video decoding/encoding in OpenCV.
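As a back-of-the-envelope check on what "keeping up with TV frame rates" means, the per-frame time budget is just the reciprocal of the frame rate. A quick sketch (plain arithmetic, no OpenCV or FFmpeg required; the function names are illustrative, not a library API):

```python
# Per-frame processing budget at common TV frame rates.
def frame_budget_ms(fps: float) -> float:
    """Milliseconds available per frame to decode AND analyze."""
    return 1000.0 / fps

# Pixels that must be touched per second for a given resolution.
def pixels_per_second(width: int, height: int, fps: float) -> float:
    return width * height * fps

ntsc = frame_budget_ms(29.97)               # ~33.4 ms per frame
pal = frame_budget_ms(25.0)                 # 40.0 ms per frame
hd = pixels_per_second(1920, 1080, 29.97)   # ~62 million pixels/s

print(f"NTSC budget: {ntsc:.1f} ms, PAL budget: {pal:.1f} ms")
print(f"1080p @ 29.97 fps: {hd / 1e6:.0f} Mpix/s")
```

Whatever library you choose, decode plus analysis has to fit inside that ~33 ms window on average, or frames will back up.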

2.
I have not seen issues with high latency while using OpenCV for decoding. How much latency can your system tolerate? If you need to increase performance, consider using separate threads for capture/decoding and image analysis. Since you mentioned having multiprocessor systems, this should let you take greater advantage of your processing capabilities. I would definitely recommend using the latest Intel Core i7 (or possibly the Xeon equivalent) architecture, as this will give you the best performance available today.
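The separate-threads suggestion can be sketched as a producer/consumer pipeline. This is a stand-in using only the Python standard library: in a real system the producer would call the decoder (e.g., OpenCV's VideoCapture) and `analyze` would be the actual image analysis; `run_pipeline` and the frame strings are hypothetical placeholders:

```python
import threading
import queue

def run_pipeline(num_frames: int, analyze) -> list:
    """Decode (producer) and analyze (consumer) on separate threads,
    connected by a bounded queue so a slow consumer applies backpressure."""
    frames = queue.Queue(maxsize=8)   # small buffer: bounds memory and latency
    results = []

    def producer():
        for i in range(num_frames):
            frames.put(f"frame-{i}")  # stand-in for a decoded frame
        frames.put(None)              # sentinel: end of stream

    def consumer():
        while True:
            frame = frames.get()
            if frame is None:
                break
            results.append(analyze(frame))

    t1 = threading.Thread(target=producer)
    t2 = threading.Thread(target=consumer)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results

# Example: a trivial "analysis" that just tags each frame.
out = run_pipeline(5, lambda f: f.upper())
print(out)  # ['FRAME-0', 'FRAME-1', 'FRAME-2', 'FRAME-3', 'FRAME-4']
```

The bounded queue is the important design choice: it keeps decode and analysis overlapped on separate cores while preventing the decoder from racing arbitrarily far ahead of the analysis.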

I have used OpenCV on several embedded systems, so I'm quite familiar with your desire for peak performance. I have found many times that it was unnecessary to process a full-frame image (especially when trying to determine masks). I would highly recommend down-sampling the images if you are having difficulty processing your acquired video streams. This can sometimes instantly give you a 4-8X speedup (depending on your down-sample factor). Also on the performance front, I would definitely recommend using Intel's IPP. Since OpenCV was originally an Intel project, IPP and OpenCV blend very well together.
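The 4-8X figure falls out of simple pixel counting: work per frame scales with area, so dividing each image dimension by a factor d divides the pixel count by d². A minimal check of the arithmetic (function names are illustrative, not an OpenCV API):

```python
def downsample_speedup(factor: int) -> int:
    """Relative reduction in pixels to process when each image
    dimension is divided by `factor` (work scales with area)."""
    return factor * factor

def downsampled_pixels(width: int, height: int, factor: int) -> int:
    """Pixel count of the down-sampled frame."""
    return (width // factor) * (height // factor)

# Halving 1080p in each dimension: 4x fewer pixels per frame.
print(downsample_speedup(2))              # 4
print(downsampled_pixels(1920, 1080, 2))  # 518400, vs 2073600 at full frame
```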

Finally, because image processing is one of those "embarrassingly parallel" problem domains, don't forget about the possibility of using GPUs as a hardware accelerator for your problems if needed. OpenCV has been doing a lot of work in this area as of late, so you should have those tools available to you if needed.

3.
I think FFmpeg would be a good starting point; most of the alternatives I can think of (HandBrake, MEncoder, etc.) use FFmpeg as a backend, but it looks like you could probably roll your own with IPP's Video Coding library if you wanted to.

4.
OpenCV's internal representation of colors is BGR unless you use something like cvtColor to convert it. If you would like to see a list of the pixel formats that are supported by FFmpeg, you can run

ffmpeg -pix_fmts 

to see what it can input and output.

mevatron answered Nov 05 '22


For the 4th question only:

Video streams are typically encoded in a YUV-family format: YUV 4:2:2, YCbCr, etc. Converting them to BGR and back (for re-encoding) eats up a lot of time. So if you can write your algorithms to run on YUV, you'll get an instant performance boost.

Note 1. While OpenCV natively supports BGR images, you can make it process YUV, with some care and knowledge about its internals.

For example, if you want to detect some people in the video, just take the upper half of the decoded video buffer (for planar YUV 4:2:2 this is the Y plane, i.e., the grayscale representation of the image) and process it.
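That Y-plane trick can be illustrated on a synthetic buffer. This sketch assumes planar YUV 4:2:2, where the luma plane occupies the first width*height bytes, i.e., exactly the first half of the buffer; `y_plane` is a hypothetical helper, not a library call:

```python
def y_plane(yuv422p: bytes, width: int, height: int) -> bytes:
    """Extract the luma (grayscale) plane from a planar YUV 4:2:2 buffer.
    Layout: Y plane (w*h bytes), then U and V planes (w*h/2 bytes each),
    so luma is exactly the first half of the buffer."""
    assert len(yuv422p) == width * height * 2
    return yuv422p[: width * height]

# Synthetic 4x2 frame: a luma ramp followed by flat chroma planes.
w, h = 4, 2
buf = bytes(range(w * h)) + bytes([128] * (w * h))
luma = y_plane(buf, w, h)
print(list(luma))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

No conversion, no copy beyond the slice: grayscale analysis can run directly on that plane.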

Note 2. If you want to access the YUV image in OpenCV, you must use the ffmpeg API directly in your app. OpenCV forces the conversion from YUV to BGR in its VideoCapture API.

Sam answered Nov 05 '22