
How can I convert an FFmpeg AVFrame with pixel format AV_PIX_FMT_CUDA to a new AVFrame with pixel format AV_PIX_FMT_RGB

I have a simple C++ application that uses FFmpeg 3.2 to receive an H264 RTP stream. In order to save CPU, I'm doing the decoding part with the codec h264_cuvid. My FFmpeg 3.2 is compiled with hw acceleration enabled. In fact, if I do the command:

ffmpeg -hwaccels

I get

cuvid

This means that my FFmpeg setup has everything it needs to "speak" with my NVIDIA card. The frames that the function avcodec_decode_video2 gives me have the pixel format AV_PIX_FMT_CUDA. I need to convert those frames to new ones with AV_PIX_FMT_RGB. Unfortunately, I can't do the conversion using the well known functions sws_getContext and sws_scale, because the pixel format AV_PIX_FMT_CUDA is not supported. If I try with swscale I get the error:

"cuda is not supported as input pixel format"

Do you know how to convert an FFmpeg AVFrame from AV_PIX_FMT_CUDA to AV_PIX_FMT_RGB? (Pieces of code would be very much appreciated.)

asked Nov 01 '17 by costef

3 Answers

This is my understanding of hardware decoding on the latest FFmpeg 4.1. Below are my conclusions after studying the source code.

First, I recommend taking inspiration from the hw_decode example:

https://github.com/FFmpeg/FFmpeg/blob/release/4.1/doc/examples/hw_decode.c

With the new API, you send a packet to the decoder using avcodec_send_packet(), then use avcodec_receive_frame() to retrieve the decoded frame.

There are two different kinds of AVFrame: the software one, which is stored in "CPU" memory (a.k.a. RAM), and the hardware one, which is stored in the graphics card's memory.
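As a rough sketch of that decode loop (error handling trimmed; decoderCtx and pkt are assumed to come from your own demuxing/setup code, so treat this as an illustration rather than the exact hw_decode code):

// Feed one compressed packet to the decoder.
int ret = avcodec_send_packet(decoderCtx, pkt);
if (ret < 0) {
    // Err
}

// Drain every frame the decoder has ready.
AVFrame* frame = av_frame_alloc();
while (avcodec_receive_frame(decoderCtx, frame) == 0) {
    if (frame->format == AV_PIX_FMT_CUDA) {
        // Hardware frame: the pixels live in GPU memory and
        // frame->data[] are device pointers you cannot read directly.
    } else {
        // Software frame: frame->data[] points to ordinary RAM.
    }
    av_frame_unref(frame);
}
av_frame_free(&frame);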

Getting AVFrame from the hardware

To retrieve the hardware frame and turn it into a readable AVFrame that can be converted with swscale, av_hwframe_transfer_data() must be used to download the data from the graphics card. Then look at the pixel format of the retrieved frame; it is usually NV12 when using NVIDIA decoding.

// According to the API, if the format of the AVFrame is set before calling 
// av_hwframe_transfer_data(), the graphic card will try to automatically convert
// to the desired format. (with some limitation, see below)
m_swFrame->format = AV_PIX_FMT_NV12;

// retrieve data from GPU to CPU
err = av_hwframe_transfer_data(
     m_swFrame, // The frame that will contain the usable data.
     m_decodedFrame, // Frame returned by avcodec_receive_frame()
     0);

const char* gpu_pixfmt = av_get_pix_fmt_name((AVPixelFormat)m_decodedFrame->format);
const char* cpu_pixfmt = av_get_pix_fmt_name((AVPixelFormat)m_swFrame->format);
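Once the frame is in system memory, the conversion the question asks about can be done with plain swscale. A minimal sketch, assuming m_swFrame came back as NV12 and using a hypothetical m_rgbFrame that you allocate yourself:

// Needs libswscale: extern "C" { #include <libswscale/swscale.h> }
SwsContext* swsCtx = sws_getContext(
    m_swFrame->width, m_swFrame->height, AV_PIX_FMT_NV12,
    m_swFrame->width, m_swFrame->height, AV_PIX_FMT_RGB24,
    SWS_BILINEAR, nullptr, nullptr, nullptr);

AVFrame* m_rgbFrame = av_frame_alloc();
m_rgbFrame->format = AV_PIX_FMT_RGB24;
m_rgbFrame->width  = m_swFrame->width;
m_rgbFrame->height = m_swFrame->height;
av_frame_get_buffer(m_rgbFrame, 0); // allocates the RGB buffer

// Convert the downloaded NV12 planes into packed RGB24.
sws_scale(swsCtx,
          m_swFrame->data, m_swFrame->linesize, 0, m_swFrame->height,
          m_rgbFrame->data, m_rgbFrame->linesize);

sws_freeContext(swsCtx);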

Listing supported "software" pixel formats

A side note here: if you want to select the software pixel format yourself, not every AVPixelFormat is supported. AVHWFramesConstraints is your friend here:

AVBufferRef* hwDeviceCtx = nullptr;
AVHWDeviceType type = AV_HWDEVICE_TYPE_CUDA;
int err = av_hwdevice_ctx_create(&hwDeviceCtx, type, nullptr, nullptr, 0);
if (err < 0) {
    // Err
}

AVHWFramesConstraints* hw_frames_const = av_hwdevice_get_hwframe_constraints(hwDeviceCtx, nullptr);
if (hw_frames_const == nullptr) {
    // Err
}

// Check if we can convert the pixel format to a readable format.
AVPixelFormat found = AV_PIX_FMT_NONE;
for (AVPixelFormat* p = hw_frames_const->valid_sw_formats; 
    *p != AV_PIX_FMT_NONE; p++)
{
    // Check if we can convert to the desired format.
    if (sws_isSupportedInput(*p))
    {
        // Ok! This format can be used with swscale!
        found = *p;
        break;
    }
}

// Don't forget to free the constraint object.
av_hwframe_constraints_free(&hw_frames_const);

// Attach your hw device to your codec context if you want to use hw decoding.
// Check AVCodecContext.hw_device_ctx!
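
For example, attaching the device could look like this (a small sketch; decoderCtx is assumed to be your AVCodecContext):

decoderCtx->hw_device_ctx = av_buffer_ref(hwDeviceCtx);
if (decoderCtx->hw_device_ctx == nullptr) {
    // Err
}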

Finally, a quicker way is probably the av_hwframe_transfer_get_formats() function, but you need to decode at least one frame.
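
For reference, a minimal sketch of that call, assuming m_decodedFrame is a hardware frame you already received:

enum AVPixelFormat* transferFormats = nullptr;
int ret = av_hwframe_transfer_get_formats(
    m_decodedFrame->hw_frames_ctx,        // hw frames context attached to the decoded frame
    AV_HWFRAME_TRANSFER_DIRECTION_FROM,   // formats we can download *from* the GPU
    &transferFormats, 0);
if (ret >= 0) {
    for (enum AVPixelFormat* p = transferFormats; *p != AV_PIX_FMT_NONE; p++) {
        // *p is a software format usable with av_hwframe_transfer_data()
    }
    av_freep(&transferFormats);
}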

Hope this will help!

answered Oct 12 '22 by lp35

You must use vf_scale_npp to do this. You can use either nppscale_deinterleave or nppscale_resize, depending on your needs.

Both have the same input parameters: an AVFilterContext that should be initialized with nppscale_init, an NPPScaleStageContext which takes your input/output pixel formats, and two AVFrames which are, of course, your input and output frames.

For more information you can look at the npplib\nppscale definition, which has done the CUDA-accelerated format conversion and scaling since FFmpeg 3.1.
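
For reference, the same NPP scaler is also reachable from the ffmpeg command line on a recent CUDA-enabled build; a rough sketch (the exact option names may differ between FFmpeg versions, so treat the flags as an assumption):

ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 \
       -vf "scale_npp=1280:720:format=nv12,hwdownload,format=nv12" \
       -c:v libx264 output.mp4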

Anyway, I recommend using the NVIDIA Video Codec SDK directly for this purpose.

answered Oct 12 '22 by HMD


I am not an FFmpeg expert, but I had a similar problem and managed to solve it. I was getting AV_PIX_FMT_NV12 from cuvid (the mjpeg_cuvid decoder) and wanted AV_PIX_FMT_CUDA for CUDA processing.

I found that setting the pixel format just before decoding the frame worked.

    pCodecCtx->pix_fmt = AV_PIX_FMT_CUDA; // change format here
    avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
    // do something with pFrame->data[0] (Y) and pFrame->data[1] (UV)

You can check which pixel formats are supported by your decoder using pix_fmts:

    AVCodec *pCodec = avcodec_find_decoder_by_name("mjpeg_cuvid");
    for (int i = 0; pCodec->pix_fmts[i] != AV_PIX_FMT_NONE; i++)
            std::cout << pCodec->pix_fmts[i] << std::endl;

I'm sure there's a better way of doing this, but I then used this list to map the integer pixel format IDs to human-readable pixel formats.
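
One such better way, for what it's worth, is av_get_pix_fmt_name(), which returns the readable name directly:

    for (int i = 0; pCodec->pix_fmts[i] != AV_PIX_FMT_NONE; i++)
            std::cout << av_get_pix_fmt_name(pCodec->pix_fmts[i]) << std::endl;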

If that doesn't work, you can do a cudaMemcpy to transfer your pixels from device to host:

    cudaMemcpy(pLocalBuf, pFrame->data[0], size, cudaMemcpyDeviceToHost);

The conversion from YUV to RGB/RGBA can be done many ways. This example does it using the libavdevice API.
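
If you would rather stay on the GPU entirely, NPP also offers a direct NV12-to-RGB conversion. A rough sketch, assuming pFrame holds AV_PIX_FMT_CUDA data (so data[0]/data[1] are device pointers) and d_rgb is a hypothetical device buffer of width*height*3 bytes that you allocated yourself:

    // Needs NPP from the CUDA toolkit: #include <nppi.h>
    const Npp8u* pSrc[2] = { pFrame->data[0], pFrame->data[1] }; // Y plane, interleaved UV plane
    NppiSize roi = { pFrame->width, pFrame->height };
    NppStatus st = nppiNV12ToRGB_8u_P2C3R(pSrc, pFrame->linesize[0],
                                          d_rgb, pFrame->width * 3, roi);
    if (st != NPP_SUCCESS) {
        // handle the error
    }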

answered Oct 12 '22 by Jean B.