I'm writing an app that encodes video from camera input and processes video through decode-edit-encode steps. For the camera, I use the Camera class rather than an Intent so that I can configure the camera settings in detail. I then feed the camera frames to the encoder (MediaCodec, API 16) and the muxer (I use the ffmpeg muxer since I want to support 4.1 devices).
I timestamp the camera frames with the system nano time and select a subset of frames to fit a desired FPS (currently 15). There is some small "noise" in the time values, for example (in us): 0, 60718, 135246, 201049, ... rather than 0, 66000, 133000, 200000, ... .
After some trial and error configuring the muxer correctly (as in this question), I can produce a video (AVC codec) that the video player on the devices can play back. The playback speed is correct, so I think the video has correct timing information for its frames.
However, I ran into a problem when I tried to decode the video for editing. I use the standard extract/decode steps from these samples, like this:
int decode_input_index = decoder.dequeueInputBuffer(TIMEOUT_USEC);
if (decode_input_index >= 0)
{
    ByteBuffer decoder_input_buffer = decode_input_buffers[decode_input_index];
    int sample_size = extractor.readSampleData(decoder_input_buffer, 0);
    if (sample_size < 0)
    {
        decoder.queueInputBuffer(decode_input_index, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
        is_decode_input_done = true;
    }
    else
    {
        long sample_time = extractor.getSampleTime();
        decoder.queueInputBuffer(decode_input_index, 0, sample_size, sample_time, 0);
        extractor.advance();
    }
}
else
{
    Log.v(TAG, "Decoder dequeueInputBuffer timed out! Try again later");
}
The sample times from getSampleTime() have the correct values, matching what I set when encoding the video (e.g., they are exactly 0, 60718, 135246, 201049, ... in us). These are also the presentation times passed to decoder.queueInputBuffer(). When the decoder goes on to decode a frame, I get the frame time like this:
int decode_output_index = decoder.dequeueOutputBuffer(decode_buffer_info, TIMEOUT_USEC);
switch (decode_output_index)
{
    // ....
    // (some negative-value flags in MediaCodec)
    // ....
    default:
    {
        ByteBuffer decode_output_buffer = decode_output_buffers[decode_output_index];
        long ptime_us = decode_buffer_info.presentationTimeUs;
        boolean is_decode_EOS = ((decode_buffer_info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0);
        // ....
    }
}
I expect to get the same time sequence as the one at the decoder input, but I get a lot of 0's from the BufferInfo at the decoder output. The decoded frame contents seem to be correct, but most of the presentation time values are 0. Only the last few frames have correct presentation times.
I tested the exact same process on a device with Android 4.3 (still with the same ffmpeg muxer rather than the MediaMuxer from API 18), and everything looks fine. On 4.1/4.2 devices, if I capture the video with the device's built-in camera app and then decode it, the presentation times are also correct, although the time values also contain noise due to camera delay.
What's wrong with the video or the decode process when the video can be played back and decoded normally, with correct sample times but bad presentation times? I may have to use a workaround that derives the presentation time from the sample time (easy with a queue), but I want to figure out whether I'm missing something.
There is no guarantee that MediaCodec handles presentation time stamps correctly before Android 4.3. This is because the CTS tests that confirm the PTS behavior were not added until then.
I do recall that there were problems with the timestamp handling in the AVC codecs from certain vendors. I don't remember the details offhand, but if you run the buffer-to-buffer and buffer-to-surface tests from EncodeDecodeTest on a variety of 4.1/4.2 devices you'll turn up some failures. (You'd need to strip out the surface-to-surface tests, of course.)
Your timestamp handling code looks fine. The timestamp isn't part of the H.264 stream, so it's really just being forwarded through the codec as metadata, and you seem to be picking it up and forwarding it on in all the right places. The bottom line is, if you're passing valid PTS values in and getting good video but garbage PTS values, something in the codec is mishandling them.
You'll need to work around it by passing the values separately, or -- if the input frame rate is always regular -- trivially computing it. In theory the encoder can reorder frames, so the order in which you pass time stamps into the encoder might not be the same order in which they come out... but since you know the timestamps were ascending when you made the movie, you should be able to just sort them if this were a problem in practice.
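To make the workaround concrete, here is a minimal sketch of carrying the timestamps alongside the codec rather than through it. PtsTracker and its method names are hypothetical (nothing here is part of the Android SDK), and it assumes the decoder emits frames in presentation order and emits exactly one output frame per queued input sample. If frames can be dropped, the pairing would drift and you'd need something smarter.

```java
import java.util.PriorityQueue;

/**
 * Hypothetical helper, not an Android API: remembers the timestamps handed to
 * the decoder and returns them in ascending order as decoded frames come out.
 */
final class PtsTracker {
    // Min-heap, so the smallest pending timestamp is always returned first.
    // This is the "just sort them" step, done incrementally.
    private final PriorityQueue<Long> pending = new PriorityQueue<>();

    /** Call with extractor.getSampleTime() right before queueInputBuffer(). */
    void onInputQueued(long sampleTimeUs) {
        pending.add(sampleTimeUs);
    }

    /**
     * Call once per decoded frame and use the result in place of
     * BufferInfo.presentationTimeUs. Returns -1 if nothing is pending.
     */
    long onOutputDequeued() {
        Long next = pending.poll();
        return next == null ? -1L : next;
    }

    /** Alternative for a strictly regular frame rate: compute the PTS from the frame index. */
    static long ptsForFrame(long frameIndex, int fps) {
        return frameIndex * 1_000_000L / fps;
    }
}
```

In the loops from the question, onInputQueued(sample_time) would go next to the queueInputBuffer() call, and onOutputDequeued() would be called in the default: branch for each buffer that actually holds a frame (skip the empty end-of-stream buffer).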
On a different note, delays in the system will cause the "wobble" you're seeing in the timestamp values if you grab System.nanoTime() when the frame arrives in the app. You can do a bit better in Android 4.3 with Surface input because the SurfaceTexture holds a timestamp that is set much closer to when the frame was captured. (I know that's not useful for your current efforts, but wanted to give some hope for the future.)