
Using ffmpeg to capture frames from a webcam and audio from a microphone, and saving to a file

Tags: c++, c, video, ffmpeg, qt

For the past few weeks I've been struggling with the ffmpeg API, since I can't find clear documentation and searching is hard because all the solutions I find online involve the ffmpeg.c command-line program rather than the C API. I am creating a program which needs to capture video from a webcam along with audio, show the frames on screen, and record both the audio and frames to a video file. I am also using Qt as the framework for this project.

I've been able to show the frames on screen and even record them, but my problem is recording the audio and video together. I decided to create a simpler test program that only saves the stream to a file, without showing the frames on screen, starting from the remuxing.c example in the ffmpeg documentation. My code is as follows:

// Variables declared in the .h
AVOutputFormat *ofmt;
AVFormatContext *ifmt_ctx, *ofmt_ctx;

QString cDeviceName;
QString aDeviceName;

int audioStream, videoStream;
bool done;

// The .cpp
#include "cameratest.h"
#include <QtConcurrent/QtConcurrent>
#include <QDebug>

CameraTest::CameraTest(QString cDeviceName, QString aDeviceName, QObject *parent) :
    QObject(parent)
{
    done = false;
    this->cDeviceName = cDeviceName;
    this->aDeviceName = aDeviceName;
    av_register_all();
    avdevice_register_all();
}

void CameraTest::toggleDone() {
    done = !done;
}

int CameraTest::init() {
    ofmt = NULL;
    ifmt_ctx = NULL;
    ofmt_ctx = NULL;

    QString fullDName = cDeviceName.prepend("video=") + ":" + aDeviceName.prepend("audio="); 
    qDebug() << fullDName;
    AVInputFormat *fmt = av_find_input_format("dshow");

    int ret, i;

    if (avformat_open_input(&ifmt_ctx, fullDName.toUtf8().data(), fmt, NULL) < 0) {
       fprintf(stderr, "Could not open input file '%s'", fullDName.toUtf8().data());
       return -1;
    }
    if ((ret = avformat_find_stream_info(ifmt_ctx, 0)) < 0) {
       fprintf(stderr, "Failed to retrieve input stream information");
       return -1;
    }
    av_dump_format(ifmt_ctx, 0, fullDName.toUtf8().data(), 0);
    avformat_alloc_output_context2(&ofmt_ctx, NULL, NULL, "test.avi");
    if (!ofmt_ctx) {
       fprintf(stderr, "Could not create output context\n");
       ret = AVERROR_UNKNOWN;
       return -1;
    }
    ofmt = ofmt_ctx->oformat;

    for (i = 0; i < ifmt_ctx->nb_streams; i++) {
       AVStream *in_stream = ifmt_ctx->streams[i];
       AVStream *out_stream = avformat_new_stream(ofmt_ctx, in_stream->codec->codec);

       if (ifmt_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
           videoStream = i;
       }
       else if (ifmt_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO) {
           audioStream = i;
       }

       if (!out_stream) {
           fprintf(stderr, "Failed allocating output stream\n");
           ret = AVERROR_UNKNOWN;
           return -1;
       }
       ret = avcodec_copy_context(out_stream->codec, in_stream->codec);
       if (ret < 0) {
           fprintf(stderr, "Failed to copy context from input to output stream codec context\n");
           return -1;
       }
       out_stream->codec->codec_tag = 0;
       if (ofmt_ctx->oformat->flags & AVFMT_GLOBALHEADER)
           out_stream->codec->flags |= CODEC_FLAG_GLOBAL_HEADER;
    }
    av_dump_format(ofmt_ctx, 0, "test.avi", 1);
    if (!(ofmt->flags & AVFMT_NOFILE)) {
       ret = avio_open(&ofmt_ctx->pb, "test.avi", AVIO_FLAG_WRITE);
       if (ret < 0) {
           fprintf(stderr, "Could not open output file '%s'", "test.avi");
           return -1;
       }
    }
    ret = avformat_write_header(ofmt_ctx, NULL);
    if (ret < 0) {
       fprintf(stderr, "Error occurred when opening output file\n");
       return -1;
    }
    QtConcurrent::run(this, &CameraTest::grabFrames);
    return 0;
}

void CameraTest::grabFrames() {
    AVPacket pkt;
    int ret = 0;
    while (av_read_frame(ifmt_ctx, &pkt) >= 0) {
        AVStream *in_stream, *out_stream;
        in_stream  = ifmt_ctx->streams[pkt.stream_index];
        out_stream = ofmt_ctx->streams[pkt.stream_index];
        /* copy packet */
        pkt.pts = av_rescale_q_rnd(pkt.pts, in_stream->time_base, out_stream->time_base, (AVRounding) (AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX));
        pkt.dts = av_rescale_q_rnd(pkt.dts, in_stream->time_base, out_stream->time_base, (AVRounding) (AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX));
        pkt.duration = av_rescale_q(pkt.duration, in_stream->time_base, out_stream->time_base);
        pkt.pos = -1;
        ret = av_interleaved_write_frame(ofmt_ctx, &pkt); // assign to the outer ret so the EOF check after the loop sees the last result
        if (ret < 0) {
           qDebug() << "Error muxing packet";
           //break;
        }
        av_free_packet(&pkt);

        if(done) break;
    }
    av_write_trailer(ofmt_ctx);

    avformat_close_input(&ifmt_ctx);
    /* close output */
    if (ofmt_ctx && !(ofmt->flags & AVFMT_NOFILE))
       avio_close(ofmt_ctx->pb);
    avformat_free_context(ofmt_ctx);
    if (ret < 0 && ret != AVERROR_EOF) {
        //return -1;
       //fprintf(stderr, "Error occurred: %s\n", av_err2str(ret));
    }
}

The av_interleaved_write_frame call returns an error for the video packets. The resulting file contains only the first frame, but the audio seems to be OK.

On the console this is what is printed:

Input #0, dshow, from 'video=Integrated Camera:audio=Microfone interno (Conexant 206':
  Duration: N/A, start: 146544.738000, bitrate: 1411 kb/s
    Stream #0:0: Video: rawvideo, bgr24, 640x480, 30 tbr, 10000k tbn, 30 tbc
    Stream #0:1: Audio: pcm_s16le, 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, avi, to 'test.avi':
    Stream #0:0: Video: rawvideo, bgr24, 640x480, q=2-31, 30 tbc
    Stream #0:1: Audio: pcm_s16le, 44100 Hz, 2 channels, s16, 1411 kb/s

[avi @ 0089f660] Using AVStream.codec.time_base as a timebase hint to the muxer is deprecated. Set AVStream.time_base instead.
[avi @ 0089f660] Using AVStream.codec.time_base as a timebase hint to the muxer is deprecated. Set AVStream.time_base instead.
[avi @ 0089f660] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 4396365 >= 4396365
[avi @ 0089f660] Too large number of skipped frames 4396359 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396360 > 60000
[avi @ 0089f660] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 4396390 >= 4396390
[avi @ 0089f660] Too large number of skipped frames 4396361 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396362 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396364 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396365 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396366 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396367 > 60000

This seems to me like a simple problem to solve but I really am mostly clueless about the ffmpeg API, if someone could lead me to the right direction that would be great!

Thanks!

Asked Dec 13 '14 by Solidus


1 Answer

Your problem seems to be somewhat specific to DirectShow. Unfortunately I don't have access to a system with DirectShow, but from the symptoms it looks like capture is not your problem; what is wrong is the muxing part. Maybe the format of the video packets is not directly supported in AVI, or maybe the timestamps on the packets are broken.

I recommend a few things for you to try, one at a time:

  • Try using av_write_frame instead of av_interleaved_write_frame.
  • Use a better container, like MP4 or MKV.
  • Do not try to mux the input packets into an avi file. In grabFrames, take the raw video packets and dump them into a file. That should give you a file that is playable by ffplay. (You will probably have to specify the resolution, pixel format and format in your ffplay command.)
  • Did the above result in a playable video file? If yes, then I'd recommend that you decode the individual video packets, convert the colorspace, and encode them with a common codec. (I recommend yuv420p in h264.) The FFmpeg codebase has two examples which should be useful: demuxing_decoding.c and decoding_encoding.c. That should give you a proper video file, playable in most players.

I don't know anything about DirectShow, and I don't know your use case, so my recommendations focus on the FFmpeg API. Some of them may be overkill or may not do what you want.

Answered Sep 19 '22 by Shakkhar