I'm trying to use Xuggler (which I believe uses ffmpeg under the hood) to do the following:
(1) Encode a BufferedImage (M/JPEG) into an H.264 stream
(2) Figure out what the "BufferedImage-equivalent" for a raw audio feed is
(3) Encode this audio class into an AAC audio stream
(4) Multiplex both streams into the same MPEG-TS container
I've watched/read some of their excellent tutorials, and so far here's what I've got:
// I'll worry about implementing this functionality later, but
// it involves querying native device drivers.
byte[] nextMjpeg = getNextMjpegFromSerialPort();
// I'll worry about implementing this functionality later as well;
// I'm simply including these for thoroughness.
BufferedImage mjpeg = MjpegFactory.newMjpeg(nextMjpeg);
// Specify an H.264 video stream (how?)
String h264Stream = "???";
IMediaWriter writer = ToolFactory.makeWriter(h264Stream);
writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264);
writer.encodeVideo(0, mjpeg);
For one, I think I'm close here, but it's still not correct; and I've only gotten this far by reading the video code examples (not the audio - I can't find any good audio examples).
Literally, I'll be getting byte-level access to the raw video and audio feeds coming into my Xuggler implementation. But for the life of me I can't figure out how to get them into an h.264/AAC/MPEG-TS format. Thanks in advance for any help here.
Looking at this Xuggler sample code, the following should work to encode video as H.264 and mux it into an MPEG2-TS container:
IMediaWriter writer = ToolFactory.makeWriter("output.ts");
writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264, width, height);
for (...)
{
BufferedImage mjpeg = ...;
writer.encodeVideo(0, mjpeg);
}
The container type is guessed from the file extension; the codec is specified explicitly.
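Putting that together, a minimal self-contained sketch might look like the following. The frame size, the 25 fps rate and the getNextFrame() helper are assumptions standing in for your MJPEG-decoding code, not part of the Xuggler API:
import java.awt.image.BufferedImage;
import java.util.concurrent.TimeUnit;

import com.xuggle.mediatool.IMediaWriter;
import com.xuggle.mediatool.ToolFactory;
import com.xuggle.xuggler.ICodec;

public class MjpegToTsSketch
{
    public static void main(String[] args)
    {
        int width = 640, height = 480;   // assumed frame size
        long frameIntervalMs = 40;       // assumed 25 fps feed

        // the .ts extension makes the writer pick the MPEG2-TS container
        IMediaWriter writer = ToolFactory.makeWriter("output.ts");
        writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264, width, height);

        for (int i = 0; i < 250; i++)
        {
            // placeholder for decoding the next MJPEG frame from the device;
            // Xuggler converts BufferedImages most reliably from TYPE_3BYTE_BGR
            BufferedImage frame = getNextFrame(width, height);
            writer.encodeVideo(0, frame, i * frameIntervalMs, TimeUnit.MILLISECONDS);
        }

        writer.close(); // flushes any buffered packets and writes the trailer
    }

    private static BufferedImage getNextFrame(int width, int height)
    {
        return new BufferedImage(width, height, BufferedImage.TYPE_3BYTE_BGR);
    }
}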
To mux audio and video, you would do something like this:
writer.addVideoStream(videoStreamIndex, 0, videoCodec, width, height);
writer.addAudioStream(audioStreamIndex, 0, audioCodec, channelCount, sampleRate);
while (... have more data ...)
{
BufferedImage videoFrame = ...;
long videoFrameTime = ...; // this is the time to display this frame
writer.encodeVideo(videoStreamIndex, videoFrame, videoFrameTime, DEFAULT_TIME_UNIT);
short[] audioSamples = ...; // the size of this array should be number of samples * channelCount
long audioSamplesTime = ...; // this is the time to play back this bit of audio
writer.encodeAudio(audioStreamIndex, audioSamples, audioSamplesTime, DEFAULT_TIME_UNIT);
}
In this case I believe your code is responsible for interleaving the audio and video: you want to call either encodeAudio() or encodeVideo() on each pass through the loop, based on which available data (a chunk of audio samples or a video frame) has the earlier timestamp.
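A rough sketch of that decision, in the same placeholder style as the loop above (the has*()/next*() helpers are stand-ins for whatever your capture code provides):
while (hasAudio() || hasVideo())
{
    boolean audioIsEarlier =
        hasAudio() && (!hasVideo() || nextAudioTime() <= nextVideoTime());

    if (audioIsEarlier)
    {
        // the pending audio chunk has the earlier timestamp, so encode it first
        writer.encodeAudio(audioStreamIndex, nextAudioSamples(), nextAudioTime(), DEFAULT_TIME_UNIT);
    }
    else
    {
        // otherwise the pending video frame comes first
        writer.encodeVideo(videoStreamIndex, nextVideoFrame(), nextVideoTime(), DEFAULT_TIME_UNIT);
    }
}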
There is another, lower-level API you may end up using, based on IStreamCoder, which gives more control over various parameters. I don't think you will need to use that.
To answer the specific questions you asked:
(1) "Encode a BufferedImage (M/JPEG) into a h.264 stream" - you already figured that out, writer.addVideoStream(..., ICodec.ID.CODEC_ID_H264)
makes sure you get the H.264 codec. To get a transport stream (MPEG2 TS) container, simply call makeWriter()
with a filename with a .ts extension.
(2) "Figure out what the "BufferedImage-equivalent" for a raw audio feed is" - that is either a short[] or an IAudioSamples object (both seem to work, but IAudioSamples has to be constructed from an IBuffer which is much less straightforward).
(3) "Encode this audio class into an AAC audio stream" - call writer.addAudioStream(..., ICodec.ID.CODEC_ID_AAC, channelCount, sampleRate)
(4) "multiplex both stream into the same MPEG-TS container" - call makeWriter()
with a .ts filename, which sets the container type. For correct audio/video sync you probably need to call encodeVideo()/encodeAudio() in the correct order.
P.S. Always pass the earliest audio/video available first. For example, if you have audio chunks that are 440 samples long (at a 44000 Hz sample rate, 440 / 44000 = 0.01 seconds) and video at exactly 25 fps (1 / 25 = 0.04 seconds), you would give them to the writer in this order:
video0 @ 0.00 sec
audio0 @ 0.00 sec
audio1 @ 0.01 sec
audio2 @ 0.02 sec
audio3 @ 0.03 sec
video1 @ 0.04 sec
audio4 @ 0.04 sec
audio5 @ 0.05 sec
... and so forth
Most playback devices are probably ok with the stream as long as the consecutive audio/video timestamps are relatively close, but this is what you'd do for a perfect mux.
P.S. There are a few docs you may want to refer to: Xuggler class diagram, ToolFactory, IMediaWriter, ICodec.