Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's FFmpeg doing with avcodec_send_packet()?

I'm trying to optimise a piece of software for playing video, which internally uses the FFmpeg libraries for decoding. We've found that on some large (4K, 60fps) video, it sometimes takes longer to decode a frame than that frame should be displayed for; sadly, because of the problem domain, simply buffering/skipping frames is not an option.

However, it appears that the FFmpeg executable is able to decode the video in question fine, at about 2x speed, so I've been trying to work out what we're doing wrong.

I've written a very stripped-back decoder program for testing; the source is here (it's about 200 lines). From profiling it, it appears that the one major bottleneck during decoding is the avcodec_send_packet() function, which can take up to 50ms per call. However, measuring the same call in FFmpeg shows strange behaviour:

Yes, you can embed images

(these are the times taken for each call to avcodec_send_packet() in milliseconds, when decoding a 4K 25fps VP9-encoded video.)

Basically, it seems that when FFmpeg uses this function, it only really takes any amount of time to complete every N calls, where N is the number of threads being used for decoding. However, both my test decoder and the actual product use 4 threads for decoding, and this doesn't happen; when using frame-based threading, the test decoder behaves like FFmpeg using only 1 thread. This would seem to indicate that we're not using multithreading at all, but we've still seen performance improvements by using more threads.

FFmpeg's results average out to being about twice as fast overall as our decoders, so clearly we're doing something wrong. I've been reading through FFmpeg's source to try to find any clues, but it's so far eluded me.

My question is: what's FFmpeg doing here that we're not? Alternatively, how can we increase the performance of our decoder?

Any help is greatly appreciated.

like image 367
Jim Avatar asked Mar 15 '19 16:03

Jim


1 Answers

I was facing the same problem. It took me quite a while to figure out a solution which I want to share here for future references:

Enable multithreading for the decoder. Per default the decoder only uses one thread, depending on the decoder, multithreading can speed up decoding drastically.

Assuming you have AVFormatContext *format_ctx, a matching codec AVCodec* codec and AVCodecContext* codec_ctx (allocated using avcodec_alloc_context3).

Before opening the codec context (using avcodec_open2) you can configure multithreading. Check the capabilites of the codec in order to decide which kind of multithreading you can use:

// set codec to automatically determine how many threads suits best for the decoding job
codec_ctx->thread_count = 0;

if (codec->capabilities | AV_CODEC_CAP_FRAME_THREADS)
   codec_ctx->thread_type = FF_THREAD_FRAME;
else if (codec->capabilities | AV_CODEC_CAP_SLICE_THREADS)
   codec_ctx->thread_type = FF_THREAD_SLICE;
else
   codec_ctx->thread_count = 1; //don't use multithreading

Another speed-up I found out is the following: keep sending packets to the decoder (thats what avcodec_send_packet() is doing) until you get AVERROR(EAGAIN) as return value. This means the internal decoder buffers are full and you first need to collect the decoded frames (but remember to send this last packet again after the decoder is empty again). Now you can collect the decoded frames using avcodec_receive_frame until you get AVERROR(EAGAIN) again. Some decoders work way faster when they have mutiple frames queued for decoding (thats what the decoder does when codec_ctx->thread_type = FF_THREAD_FRAME is set).

like image 156
imikbox Avatar answered Oct 17 '22 23:10

imikbox