
Xuggler encoding and muxing

I'm trying to use Xuggler (which I believe uses ffmpeg under the hood) to do the following:

  • Accept a raw MJPEG video bitstream (from a small TTL serial camera) and encode/transcode it to h.264; and
  • Accept a raw audio bitstream (from a microphone) and encode it to AAC; then
  • Mux the two (audio and video) bitstreams together into an MPEG-TS container

I've watched/read some of their excellent tutorials, and so far here's what I've got:

// I'll worry about implementing this functionality later, but
// involves querying native device drivers.
byte[] nextMjpeg = getNextMjpegFromSerialPort();

// I'll also worry about implementing this functionality as well;
// I'm simply providing these for thoroughness.
BufferedImage mjpeg = MjpegFactory.newMjpeg(nextMjpeg);

// Specify a h.264 video stream (how?)
String h264Stream = "???";

IMediaWriter writer = ToolFactory.makeWriter(h264Stream);
writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264);
writer.encodeVideo(0, mjpeg);

I think I'm close here, but it's still not correct; and I've only gotten this far by reading the video code examples (not the audio - I can't find any good audio examples).

Literally, I'll be getting byte-level access to the raw video and audio feeds coming into my Xuggler implementation. But for the life of me I can't figure out how to get them into an h.264/AAC/MPEG-TS format. Thanks in advance for any help here.

asked Dec 12 '12 by IAmYourFaja

1 Answer

Looking at the Xuggler sample code, the following should work to encode video as H.264 and mux it into an MPEG2TS container:

IMediaWriter writer = ToolFactory.makeWriter("output.ts");
writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264, width, height);
for (...)
{
    BufferedImage mjpeg = ...;
    writer.encodeVideo(0, mjpeg);
}

The container type is guessed from the file extension; the codec is specified explicitly.

To mux audio and video, you would do something like this:

writer.addVideoStream(videoStreamIndex, 0, videoCodec, width, height);
writer.addAudioStream(audioStreamIndex, 0, audioCodec, channelCount, sampleRate);

while (... have more data ...)
{
    BufferedImage videoFrame = ...;
    long videoFrameTime = ...; // this is the time to display this frame
    writer.encodeVideo(videoStreamIndex, videoFrame, videoFrameTime, DEFAULT_TIME_UNIT);

    short[] audioSamples = ...; // the size of this array should be number of samples * channelCount
    long audioSamplesTime = ...; // this is the time to play back this bit of audio
    writer.encodeAudio(audioStreamIndex, audioSamples, audioSamplesTime, DEFAULT_TIME_UNIT);
}

In this case I believe your code is responsible for interleaving the audio and video: on each pass through the loop you want to call either encodeAudio() or encodeVideo(), depending on which available data (a chunk of audio samples or a video frame) has the earlier timestamp.
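That interleaving decision can be sketched without any Xuggler calls. The Chunk record and the queues below are hypothetical stand-ins for your real capture buffers; where the sketch records a label, real code would call writer.encodeVideo() or writer.encodeAudio():

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of timestamp-based interleaving: always emit whichever pending
// chunk (audio or video) has the earlier timestamp.
public class InterleaveSketch {
    record Chunk(String label, long timeMicros) {}

    static List<String> interleave(Deque<Chunk> video, Deque<Chunk> audio) {
        List<String> order = new ArrayList<>();
        while (!video.isEmpty() || !audio.isEmpty()) {
            boolean takeVideo;
            if (audio.isEmpty()) {
                takeVideo = true;
            } else if (video.isEmpty()) {
                takeVideo = false;
            } else {
                // On a tie, send the video frame first.
                takeVideo = video.peek().timeMicros() <= audio.peek().timeMicros();
            }
            Chunk next = takeVideo ? video.poll() : audio.poll();
            order.add(next.label()); // real code: writer.encodeVideo(...) / encodeAudio(...)
        }
        return order;
    }

    public static void main(String[] args) {
        Deque<Chunk> video = new ArrayDeque<>(List.of(
                new Chunk("video0", 0), new Chunk("video1", 40_000)));
        Deque<Chunk> audio = new ArrayDeque<>(List.of(
                new Chunk("audio0", 0), new Chunk("audio1", 10_000),
                new Chunk("audio2", 20_000), new Chunk("audio3", 30_000)));
        System.out.println(interleave(video, audio));
        // [video0, audio0, audio1, audio2, audio3, video1]
    }
}
```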

There is another, lower-level API you may end up using, based on IStreamCoder, which gives more control over various parameters. I don't think you will need to use that.

To answer the specific questions you asked:

(1) "Encode a BufferedImage (M/JPEG) into a h.264 stream" - you already figured that out, writer.addVideoStream(..., ICodec.ID.CODEC_ID_H264) makes sure you get the H.264 codec. To get a transport stream (MPEG2 TS) container, simply call makeWriter() with a filename with a .ts extension.

(2) "Figure out what the "BufferedImage-equivalent" for a raw audio feed is" - that is either a short[] or an IAudioSamples object (both seem to work, but IAudioSamples has to be constructed from an IBuffer which is much less straightforward).
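If the microphone feed arrives as raw bytes, getting it into that short[] form is plain Java. The 16-bit little-endian PCM format below is an assumption about the capture hardware, not something the question specifies:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Convert raw 16-bit little-endian PCM bytes into the short[] form that
// encodeAudio() accepts. Little-endian is an assumption about the capture
// hardware; switch to ByteOrder.BIG_ENDIAN if your device differs.
public class PcmToShorts {
    public static short[] toSamples(byte[] rawPcm) {
        ByteBuffer buf = ByteBuffer.wrap(rawPcm).order(ByteOrder.LITTLE_ENDIAN);
        short[] samples = new short[rawPcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = buf.getShort();
        }
        return samples;
    }

    public static void main(String[] args) {
        // bytes {0x01, 0x00} -> 1; {0xFF, 0xFF} -> -1 (little-endian)
        byte[] raw = {0x01, 0x00, (byte) 0xFF, (byte) 0xFF};
        short[] s = toSamples(raw);
        System.out.println(s[0] + " " + s[1]); // 1 -1
    }
}
```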

(3) "Encode this audio class into an AAC audio stream" - call writer.addAudioStream(..., ICodec.ID.CODEC_ID_AAC, channelCount, sampleRate)

(4) "multiplex both streams into the same MPEG-TS container" - call makeWriter() with a .ts filename, which sets the container type. For correct audio/video sync you probably need to call encodeVideo()/encodeAudio() in the correct order.

P.S. Always pass the earliest audio/video available first. For example, if you have audio chunks which are 440 samples long (at 44000 Hz sample rate, 440 / 44000 = 0.01 seconds), and video at exactly 25fps (1 / 25 = 0.04 seconds), you would give them to the writer in this order:

video0 @ 0.00 sec
audio0 @ 0.00 sec
audio1 @ 0.01 sec
audio2 @ 0.02 sec
audio3 @ 0.03 sec
video1 @ 0.04 sec
audio4 @ 0.04 sec
audio5 @ 0.05 sec

... and so forth
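The arithmetic behind that ordering can be checked with a short sketch, using the chunk sizes and rates from the example (440 samples at 44000 Hz, 25 fps video):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Reproduce the interleaving order from the example: audio chunks of
// 440 samples at 44000 Hz (0.01 s each) against 25 fps video (0.04 s
// per frame), earliest timestamp first, video first on a tie.
public class MuxOrder {
    public static List<String> schedule(int audioChunks, int videoFrames) {
        List<String> order = new ArrayList<>();
        int a = 0, v = 0;
        while (a < audioChunks || v < videoFrames) {
            double audioTime = a * 440.0 / 44000.0; // next audio timestamp
            double videoTime = v / 25.0;            // next video timestamp
            if (v < videoFrames && (a >= audioChunks || videoTime <= audioTime)) {
                order.add(String.format(Locale.ROOT, "video%d @ %.2f sec", v++, videoTime));
            } else {
                order.add(String.format(Locale.ROOT, "audio%d @ %.2f sec", a++, audioTime));
            }
        }
        return order;
    }

    public static void main(String[] args) {
        schedule(6, 2).forEach(System.out::println);
    }
}
```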

Most playback devices are probably ok with the stream as long as the consecutive audio/video timestamps are relatively close, but this is what you'd do for a perfect mux.

P.P.S. There are a few docs you may want to refer to: the Xuggler class diagram, ToolFactory, IMediaWriter, and ICodec.

answered Oct 21 '22 by Alex I