
Implementing a multiple input filter graph with the Libavfilter library in Android NDK

I am trying to use the overlay filter with multiple input sources, for an Android app. Basically, I want to overlay multiple video sources on top of a static image. I have looked at the sample that comes with ffmpeg and implemented my code based on that, but things don't seem to be working as expected.

The ffmpeg filtering sample uses a single video input. I have to handle multiple video inputs, and I am not sure that my solution is the correct one. I have tried to find other examples, but it looks like this is the only one.

Here is my code:

AVFilterContext **inputContexts;
AVFilterContext *outputContext;
AVFilterGraph *graph;

int initFilters(AVFrame *bgFrame, int inputCount, AVCodecContext **codecContexts, char *filters)
{
    int i;
    int returnCode;
    char args[512];
    char name[9];
    AVFilterInOut **graphInputs = NULL;
    AVFilterInOut *graphOutput = NULL;

    AVFilter *bufferSrc  = avfilter_get_by_name("buffer");
    AVFilter *bufferSink = avfilter_get_by_name("buffersink");

    graph = avfilter_graph_alloc();
    if(graph == NULL)
        return -1;

    //allocate inputs
    graphInputs = av_calloc(inputCount + 1, sizeof(AVFilterInOut *));
    for(i = 0; i <= inputCount; i++)
    {
        graphInputs[i] = avfilter_inout_alloc();
        if(graphInputs[i] == NULL)
            return -1;
    }

    //allocate input contexts
    inputContexts = av_calloc(inputCount + 1, sizeof(AVFilterContext *));
    //first is the background
    snprintf(args, sizeof(args), "video_size=%dx%d:pix_fmt=%d:time_base=1/1:pixel_aspect=0", bgFrame->width, bgFrame->height, bgFrame->format);
    returnCode = avfilter_graph_create_filter(&inputContexts[0], bufferSrc, "background", args, NULL, graph);
    if(returnCode < 0)
        return returnCode;
    graphInputs[0]->filter_ctx = inputContexts[0];
    graphInputs[0]->name = av_strdup("background");
    graphInputs[0]->next = graphInputs[1];

    //allocate the rest
    for(i = 1; i <= inputCount; i++)
    {
        AVCodecContext *codecCtx = codecContexts[i - 1];
        snprintf(args, sizeof(args), "video_size=%dx%d:pix_fmt=%d:time_base=%d/%d:pixel_aspect=%d/%d",
                    codecCtx->width, codecCtx->height, codecCtx->pix_fmt,
                    codecCtx->time_base.num, codecCtx->time_base.den,
                    codecCtx->sample_aspect_ratio.num, codecCtx->sample_aspect_ratio.den);
        snprintf(name, sizeof(name), "video_%d", i);

        returnCode = avfilter_graph_create_filter(&inputContexts[i], bufferSrc, name, args, NULL, graph);
        if(returnCode < 0)
            return returnCode;

        graphInputs[i]->filter_ctx = inputContexts[i];
        graphInputs[i]->name = av_strdup(name);
        graphInputs[i]->pad_idx = 0;
        if(i < inputCount)
        {
            graphInputs[i]->next = graphInputs[i + 1];
        }
        else
        {
            graphInputs[i]->next = NULL;
        }
    }

    //allocate outputs
    graphOutput = avfilter_inout_alloc();   
    returnCode = avfilter_graph_create_filter(&outputContext, bufferSink, "out", NULL, NULL, graph);
    if(returnCode < 0)
        return returnCode;
    graphOutput->filter_ctx = outputContext;
    graphOutput->name = av_strdup("out");
    graphOutput->next = NULL;
    graphOutput->pad_idx = 0;

    returnCode = avfilter_graph_parse_ptr(graph, filters, graphInputs, &graphOutput, NULL);
    if(returnCode < 0)
        return returnCode;

    returnCode = avfilter_graph_config(graph, NULL);
    if(returnCode < 0)
        return returnCode;

    return 0;
}

The filters argument of the function is passed on to avfilter_graph_parse_ptr and can look like this: [background] scale=512x512 [base]; [video_1] scale=256x256 [tmp_1]; [base][tmp_1] overlay=0:0 [out]

The code fails at the call to avfilter_graph_config with the warning "Output pad "default" with type video of the filter instance "background" of buffer not connected to any destination" and the error "Invalid argument".

What is it that I am not doing correctly?

EDIT: There are two issues that I have discovered:

  1. The description of avfilter_graph_parse_ptr is a bit vague. The outputs parameter represents a list of the current outputs of the graph. In my case that is the graphInputs variable, because these are the outputs of the buffer filters. The inputs parameter represents a list of the current inputs of the graph. In my case that is the graphOutput variable, because it represents the input to the buffersink filter.

  2. I did some testing with a scale filter and a single input. It seems that the name of the AVFilterInOut structure required by avfilter_graph_parse_ptr needs to be "in". I have tried other names, such as in_1 and in_link_1. None of them work, and I have not been able to find any documentation related to this.

So the issue still remains. How do I implement a filter graph with multiple inputs?

gookman asked Mar 14 '14 20:03


3 Answers

I have found a simple solution to the problem. It involves replacing avfilter_graph_parse_ptr with avfilter_graph_parse2 and adding the buffer and buffersink filters to the filters parameter of avfilter_graph_parse2.

So, in the simple case where you have one background image and one input video the value of the filters parameter should look like this:

buffer=video_size=1024x768:pix_fmt=2:time_base=1/25:pixel_aspect=3937/3937 [in_1]; buffer=video_size=1920x1080:pix_fmt=0:time_base=1/180000:pixel_aspect=0/1 [in_2]; [in_1] [in_2] overlay=0:0 [result]; [result] buffersink

avfilter_graph_parse2 makes all the graph connections and initializes all the filters. The filter contexts for the input buffers and for the output buffer can be retrieved from the graph itself afterwards. These are used to add frames to and get frames from the filter graph.

A simplified version of the code looks like this:

AVFilterContext **inputContexts;
AVFilterContext *outputContext;
AVFilterGraph *graph;

int initFilters(AVFrame *bgFrame, int inputCount, AVCodecContext **codecContexts)
{
    int i;
    int returnCode;
    char filters[1024];
    AVFilterInOut *gis = NULL;
    AVFilterInOut *gos = NULL;

    graph = avfilter_graph_alloc();
    if(graph == NULL)
    {
        printf("Cannot allocate filter graph.");        
        return -1;
    }

    //build the filters string here
    // ...

    returnCode = avfilter_graph_parse2(graph, filters, &gis, &gos);
    if(returnCode < 0)
    {
        cs_printAVError("Cannot parse graph.", returnCode);
        return returnCode;
    }

    returnCode = avfilter_graph_config(graph, NULL);
    if(returnCode < 0)
    {
        cs_printAVError("Cannot configure graph.", returnCode);
        return returnCode;
    }

    //get the filter contexts from the graph here

    return 0;
}
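
The "build the filters string here" step is left out of the code above. A minimal sketch of how it might be filled in follows, in plain C with no FFmpeg calls. The InputDesc struct and the buildOverlayGraph and appendf helpers are hypothetical names, not part of libavfilter. Placing every overlay at 0:0 and naming the intermediate labels tmp_N are assumptions made for illustration. The first entry is treated as the background.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one input; NOT an FFmpeg type. */
typedef struct {
    int width, height;
    int pix_fmt;          /* numeric pixel format value */
    int tb_num, tb_den;   /* time base */
    int sar_num, sar_den; /* pixel (sample) aspect ratio */
} InputDesc;

/* Appends formatted text at buf[*len]; returns -1 if it would not fit. */
static int appendf(char *buf, size_t cap, size_t *len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf + *len, cap - *len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= cap - *len)
        return -1;
    *len += (size_t)n;
    return 0;
}

/* Builds "buffer=... [in_1]; ...; [in_1] [in_2] overlay=0:0 [result]; [result] buffersink".
 * in[0] is the background; each further input is overlaid in order.
 * Returns 0 on success, -1 on bad arguments or if the buffer is too small. */
int buildOverlayGraph(char *out, size_t cap, const InputDesc *in, int count)
{
    size_t len = 0;
    char prev[32];
    char next[32];
    int i;

    assert(out != NULL && in != NULL);
    if (count < 2 || cap == 0)
        return -1;

    /* One buffer source per input, labelled in_1 .. in_N. */
    for (i = 0; i < count; i++) {
        if (appendf(out, cap, &len,
                    "buffer=video_size=%dx%d:pix_fmt=%d:time_base=%d/%d:pixel_aspect=%d/%d [in_%d]; ",
                    in[i].width, in[i].height, in[i].pix_fmt,
                    in[i].tb_num, in[i].tb_den,
                    in[i].sar_num, in[i].sar_den, i + 1) < 0)
            return -1;
    }

    /* Chain the overlays: the result of each step is the base of the next. */
    snprintf(prev, sizeof(prev), "in_1");
    for (i = 1; i < count; i++) {
        if (i == count - 1)
            snprintf(next, sizeof(next), "result");
        else
            snprintf(next, sizeof(next), "tmp_%d", i);
        if (appendf(out, cap, &len, "[%s] [in_%d] overlay=0:0 [%s]; ", prev, i + 1, next) < 0)
            return -1;
        memcpy(prev, next, sizeof(prev));
    }

    return appendf(out, cap, &len, "[result] buffersink");
}
```

With the two inputs from the example string above, this produces that same string. The filled buffer can then be passed as the filters argument to avfilter_graph_parse2.
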
gookman answered Nov 09 '22 21:11


I can't add a comment, so I'll add this here: you can fix the warning "Output pad "default" with type video of the filter instance "background" of buffer not connected to any destination" by not creating a sink at all. The filter graph will automatically create the sink for you, so you are adding too many pads.

user1546060 answered Nov 09 '22 22:11


For my case I had a transformation like this:

[0:v]pad=1008:734:144:0:black[pad];[pad][1:v]overlay=0:576[out]

If you try it with ffmpeg from the command line, it works:

ffmpeg -i first.mp4 -i second.mp4 -filter_complex "[0:v]pad=1008:734:144:0:black[pad];[pad][1:v]overlay=0:576[out]" -map "[out]" -map 0:a output.mp4

Basically, this increases the overall size of the first video with pad and then overlays the second one on top. After a long struggle with the same problems described in this thread, I got it working. The video filtering example from the FFmpeg documentation (https://ffmpeg.org/doxygen/2.1/doc_2examples_2filtering_video_8c-example.html) works fine, and after digging into it, the following worked for me:

    filterGraph = avfilter_graph_alloc();
    NULLC(filterGraph);

    bufferSink = avfilter_get_by_name("buffersink");
    NULLC(bufferSink);
    filterInput = avfilter_inout_alloc();
    AVBufferSinkParams* buffersinkParams = av_buffersink_params_alloc();
    buffersinkParams->pixel_fmts = pixelFormats;

    FFMPEGHRC(avfilter_graph_create_filter(&bufferSinkContext, bufferSink, "out", NULL, buffersinkParams, filterGraph));

    av_free(buffersinkParams);

    filterInput->name = av_strdup("out");
    filterInput->filter_ctx = bufferSinkContext;
    filterInput->pad_idx = 0;
    filterInput->next = NULL;

    filterOutputs = new AVFilterInOut*[inputFiles.size()];
    ZeroMemory(filterOutputs, sizeof(AVFilterInOut*) * inputFiles.size());
    bufferSourceContext = new AVFilterContext*[inputFiles.size()];
    ZeroMemory(bufferSourceContext, sizeof(AVFilterContext*) * inputFiles.size());

    for (i = inputFiles.size() - 1; i >= 0 ; i--)
    {
        snprintf(args, sizeof(args), "video_size=%dx%d:pix_fmt=%d:time_base=%d/%d:pixel_aspect=%d/%d", 
            videoCodecContext[i]->width, videoCodecContext[i]->height, videoCodecContext[i]->pix_fmt, videoCodecContext[i]->time_base.num, videoCodecContext[i]->time_base.den, videoCodecContext[i]->sample_aspect_ratio.num, videoCodecContext[i]->sample_aspect_ratio.den);

        filterOutputs[i] = avfilter_inout_alloc();
        NULLC(filterOutputs[i]);
        bufferSource = avfilter_get_by_name("buffer");
        NULLC(bufferSource);
        sprintf(args2, outputTemplate, i);
        FFMPEGHRC(avfilter_graph_create_filter(&bufferSourceContext[i], bufferSource, "in", args, NULL, filterGraph));

        filterOutputs[i]->name = av_strdup(args2);
        filterOutputs[i]->filter_ctx = bufferSourceContext[i];
        filterOutputs[i]->pad_idx = 0;
        filterOutputs[i]->next = i < inputFiles.size() - 1 ? filterOutputs[i + 1] : NULL;
    }

    FFMPEGHRC(avfilter_graph_parse_ptr(filterGraph, description, &filterInput, filterOutputs, NULL));
    FFMPEGHRC(avfilter_graph_config(filterGraph, NULL));

The variable types are the same as in the example above. args and args2 are char[512], and outputTemplate is "%d:v", which produces the input video IDs used in the filter expression (0:v, 1:v). A couple of things to watch out for:

  • The video information in args needs to be correct. time_base and sample_aspect_ratio are copied from the video stream of the format context.
  • Indeed, what the API calls the input is, for us, the output, and the other way around.
  • The name of the filter is "in" for all our input filters (filterOutputs).
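
To make the first point concrete, here is a rough sketch of building and sanity-checking the args string and the "%d:v" label before handing them to avfilter_graph_create_filter. It is plain C with no FFmpeg calls. formatBufferArgs and formatInputLabel are hypothetical helper names. Rejecting a non-positive time base and treating a zero aspect-ratio denominator as "unknown" (written as 0/1) are assumptions of this sketch, not documented behaviour of the answer's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats the option string for one "buffer" source.
 * Returns 0 on success, -1 on implausible values or if args is too small. */
int formatBufferArgs(char *args, size_t cap,
                     int width, int height, int pixFmt,
                     int tbNum, int tbDen, int sarNum, int sarDen)
{
    int n;

    assert(args != NULL);

    /* Assumption: a frame size or time base of zero is never valid here. */
    if (width <= 0 || height <= 0 || pixFmt < 0 || tbNum <= 0 || tbDen <= 0)
        return -1;

    /* Assumption: a zero denominator means "unknown", written as 0/1. */
    if (sarDen <= 0) {
        sarNum = 0;
        sarDen = 1;
    }

    n = snprintf(args, cap,
                 "video_size=%dx%d:pix_fmt=%d:time_base=%d/%d:pixel_aspect=%d/%d",
                 width, height, pixFmt, tbNum, tbDen, sarNum, sarDen);
    if (n < 0 || (size_t)n >= cap)
        return -1; /* output was truncated */
    return 0;
}

/* Formats the pad label for input number index, e.g. "0:v" or "1:v",
 * matching the labels used in the filter expression. */
int formatInputLabel(char *label, size_t cap, int index)
{
    int n;

    assert(label != NULL);
    if (index < 0)
        return -1;
    n = snprintf(label, cap, "%d:v", index);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return 0;
}
```

Checking the snprintf return value catches a truncated string early, which is easier to diagnose than an "Invalid argument" error from the graph later on.
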
Stefan Pintilie answered Nov 09 '22 20:11