I'm required to merge a video file and an audio file into a single video file, so that the output keeps the video's duration and the audio loops (or gets cut) to match it. The technical term for this merging operation is "muxing", as I've read.
As an example, suppose we have an input video of 10 seconds and an audio file of 4 seconds. The output video should be 10 seconds long (always the same as the input video), and the audio should play 2.5 times (the first 2 plays cover the first 8 seconds, and then 2 seconds out of the 4 for the rest).
While I have found a solution for how to mux a video and an audio file (here), I've come across multiple issues:
I can't figure out how to loop the writing of the audio content when needed. It keeps giving me an error, no matter what I try.
The input files must be of specific file formats. Otherwise, the code might throw an exception or, in very rare cases, worse: create a video file with black content. Stranger still, sometimes a '.mkv' file (for example) is accepted and sometimes it isn't (even though both play fine in a video player app).
The current code handles buffers rather than real durations. This means that in many cases I might stop muxing the audio even though I shouldn't, so the output video file ends up with shorter audio content than the original, even though the video is long enough.
I tried to make the audio's MediaExtractor go back to its beginning each time it reached the end, by using:
if (audioBufferInfo.size < 0) {
    Log.d("AppLog", "reached end of audio, looping...")
    audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
    audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, 0)
}
To check the types of the files, I tried using MediaMetadataRetriever and then checking the mime type. I think the supported ones are listed in the docs (here) as those marked with "Encoder", but I'm not sure about this. I also don't know which mime type corresponds to which format mentioned there.
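This is roughly the kind of check I mean (just a sketch; the function name and the "allowed" set below are my own guesses, and METADATA_KEY_MIMETYPE only reports the container's mime type):

import android.media.MediaMetadataRetriever
import java.io.File

// Sketch only: METADATA_KEY_MIMETYPE returns the container mime type (e.g. "video/mp4"),
// and the set of types compared against here is my own guess, not a verified list.
private val assumedSupportedMimeTypes = setOf("video/mp4", "video/3gpp", "audio/mp4")

fun seemsSupported(file: File): Boolean {
    val retriever = MediaMetadataRetriever()
    return try {
        retriever.setDataSource(file.absolutePath)
        val mimeType = retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_MIMETYPE)
        mimeType != null && mimeType in assumedSupportedMimeTypes
    } catch (e: Exception) {
        false
    } finally {
        retriever.release()
    }
}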
I also tried to re-initialize everything related to the audio, but it didn't work either.
Here's my current code for the muxing itself (full sample project available here):
object VideoAndAudioMuxer {
    // based on: https://stackoverflow.com/a/31591485/878126
    @WorkerThread
    fun joinVideoAndAudio(videoFile: File, audioFile: File, outputFile: File): Boolean {
        try {
//            val videoMediaMetadataRetriever = MediaMetadataRetriever()
//            videoMediaMetadataRetriever.setDataSource(videoFile.absolutePath)
//            val videoDurationInMs =
//                videoMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION).toLong()
//            val videoMimeType =
//                videoMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_MIMETYPE)
//            val audioMediaMetadataRetriever = MediaMetadataRetriever()
//            audioMediaMetadataRetriever.setDataSource(audioFile.absolutePath)
//            val audioDurationInMs =
//                audioMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION).toLong()
//            val audioMimeType =
//                audioMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_MIMETYPE)
//            Log.d(
//                "AppLog",
//                "videoDuration:$videoDurationInMs audioDuration:$audioDurationInMs videoMimeType:$videoMimeType audioMimeType:$audioMimeType"
//            )
//            videoMediaMetadataRetriever.release()
//            audioMediaMetadataRetriever.release()
            outputFile.delete()
            outputFile.createNewFile()
            val muxer = MediaMuxer(outputFile.absolutePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)
            val sampleSize = 256 * 1024
            //video
            val videoExtractor = MediaExtractor()
            videoExtractor.setDataSource(videoFile.absolutePath)
            videoExtractor.selectTrack(0)
            videoExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
            val videoFormat = videoExtractor.getTrackFormat(0)
            val videoTrack = muxer.addTrack(videoFormat)
            val videoBuf = ByteBuffer.allocate(sampleSize)
            val videoBufferInfo = MediaCodec.BufferInfo()
//            Log.d("AppLog", "Video Format $videoFormat")
            //audio
            val audioExtractor = MediaExtractor()
            audioExtractor.setDataSource(audioFile.absolutePath)
            audioExtractor.selectTrack(0)
            audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
            val audioFormat = audioExtractor.getTrackFormat(0)
            val audioTrack = muxer.addTrack(audioFormat)
            val audioBuf = ByteBuffer.allocate(sampleSize)
            val audioBufferInfo = MediaCodec.BufferInfo()
//            Log.d("AppLog", "Audio Format $audioFormat")
            //
            muxer.start()
//            Log.d("AppLog", "muxing video&audio...")
//            val minimalDurationInMs = Math.min(videoDurationInMs, audioDurationInMs)
            while (true) {
                videoBufferInfo.size = videoExtractor.readSampleData(videoBuf, 0)
                audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, 0)
                if (audioBufferInfo.size < 0) {
//                    Log.d("AppLog", "reached end of audio, looping...")
                    //TODO somehow start from beginning of the audio again, for looping till the video ends
//                    audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
//                    audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, 0)
                }
                if (videoBufferInfo.size < 0 || audioBufferInfo.size < 0) {
//                    Log.d("AppLog", "reached end of video")
                    videoBufferInfo.size = 0
                    audioBufferInfo.size = 0
                    break
                } else {
//                    val donePercentage = videoExtractor.sampleTime / minimalDurationInMs / 10L
//                    Log.d("AppLog", "$donePercentage")
                    // video muxing
                    videoBufferInfo.presentationTimeUs = videoExtractor.sampleTime
                    videoBufferInfo.flags = videoExtractor.sampleFlags
                    muxer.writeSampleData(videoTrack, videoBuf, videoBufferInfo)
                    videoExtractor.advance()
                    // audio muxing
                    audioBufferInfo.presentationTimeUs = audioExtractor.sampleTime
                    audioBufferInfo.flags = audioExtractor.sampleFlags
                    muxer.writeSampleData(audioTrack, audioBuf, audioBufferInfo)
                    audioExtractor.advance()
                }
            }
            muxer.stop()
            muxer.release()
//            Log.d("AppLog", "success")
            return true
        } catch (e: Exception) {
            e.printStackTrace()
//            Log.d("AppLog", "Error " + e.message)
        }
        return false
    }
}
How can I mux the video & audio files so that the audio loops in case it is shorter (in duration) than the video?
How can I do it so that the audio gets cut precisely when the video ends (no remainder of either video or audio)?
How can I check, before calling this function, whether the current device can handle the given input files and actually mux them? Is there a way to check at runtime which formats are supported for this kind of operation, instead of relying on a list in the docs that might change in the future?
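For the last question, one runtime check I've been considering (only an assumption on my part, since I don't know whether having a decoder for each track really guarantees the file can be muxed) is asking MediaCodecList for a decoder that matches each track's format:

import android.media.MediaCodecList
import android.media.MediaExtractor
import android.media.MediaFormat

// Assumption: if the device reports a decoder for every track, the file is probably
// usable as input here. Function name and logic are mine, not a verified guarantee.
fun hasDecoderForAllTracks(path: String): Boolean {
    val extractor = MediaExtractor()
    return try {
        extractor.setDataSource(path)
        val codecList = MediaCodecList(MediaCodecList.REGULAR_CODECS)
        (0 until extractor.trackCount).all { trackIndex ->
            val format = extractor.getTrackFormat(trackIndex)
            // the docs say to clear the frame rate on some API levels before this call
            format.setString(MediaFormat.KEY_FRAME_RATE, null)
            codecList.findDecoderForFormat(format) != null
        }
    } catch (e: Exception) {
        false
    } finally {
        extractor.release()
    }
}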
I had the same scenario.
1: When audioBufferInfo.size < 0, seek back to the start. But remember, you need to accumulate presentationTimeUs.
2: Get the video duration; when the audio has looped up to that duration (again using presentationTimeUs), cut it.
3: The audio file needs to be MediaFormat.MIMETYPE_AUDIO_AMR_NB, MediaFormat.MIMETYPE_AUDIO_AMR_WB, or MediaFormat.MIMETYPE_AUDIO_AAC (see the check sketched below). On my testing machines, it worked fine.
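A sketch of how point 3 can be checked before muxing (the function name is mine; extend the allowed types if your devices accept more):

// Sketch for point 3: read the audio track's mime type with MediaExtractor and
// compare it against the types that worked on my testing machines.
fun isAudioMimeTypeSupported(audioPath: String): Boolean {
    val extractor = MediaExtractor()
    return try {
        extractor.setDataSource(audioPath)
        val mime = extractor.getTrackFormat(0).getString(MediaFormat.KEY_MIME)
        mime == MediaFormat.MIMETYPE_AUDIO_AMR_NB ||
                mime == MediaFormat.MIMETYPE_AUDIO_AMR_WB ||
                mime == MediaFormat.MIMETYPE_AUDIO_AAC
    } catch (e: Exception) {
        false
    } finally {
        extractor.release()
    }
}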
Here is the code:
private fun muxing(musicName: String) {
    val saveFile = File(DirUtils.getPublicMediaPath(), "$saveName.mp4")
    if (saveFile.exists()) {
        saveFile.delete()
        PhotoHelper.sendMediaScannerBroadcast(saveFile)
    }
    try {
        // get the video file duration in microseconds
        val duration = getVideoDuration(mSaveFile!!.absolutePath)
        saveFile.createNewFile()
        val videoExtractor = MediaExtractor()
        videoExtractor.setDataSource(mSaveFile!!.absolutePath)
        val audioExtractor = MediaExtractor()
        val afdd = MucangConfig.getContext().assets.openFd(musicName)
        audioExtractor.setDataSource(afdd.fileDescriptor, afdd.startOffset, afdd.length)
        val muxer = MediaMuxer(saveFile.absolutePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)
        videoExtractor.selectTrack(0)
        val videoFormat = videoExtractor.getTrackFormat(0)
        val videoTrack = muxer.addTrack(videoFormat)
        audioExtractor.selectTrack(0)
        val audioFormat = audioExtractor.getTrackFormat(0)
        val audioTrack = muxer.addTrack(audioFormat)
        var sawEOS = false
        val offset = 100
        val sampleSize = 1000 * 1024
        val videoBuf = ByteBuffer.allocate(sampleSize)
        val audioBuf = ByteBuffer.allocate(sampleSize)
        val videoBufferInfo = MediaCodec.BufferInfo()
        val audioBufferInfo = MediaCodec.BufferInfo()
        videoExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
        audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
        muxer.start()
        val frameRate = videoFormat.getInteger(MediaFormat.KEY_FRAME_RATE)
        val videoSampleTime = 1000 * 1000 / frameRate
        while (!sawEOS) {
            videoBufferInfo.offset = offset
            videoBufferInfo.size = videoExtractor.readSampleData(videoBuf, offset)
            if (videoBufferInfo.size < 0) {
                sawEOS = true
                videoBufferInfo.size = 0
            } else {
                videoBufferInfo.presentationTimeUs += videoSampleTime
                videoBufferInfo.flags = videoExtractor.sampleFlags
                muxer.writeSampleData(videoTrack, videoBuf, videoBufferInfo)
                videoExtractor.advance()
            }
        }
        var sawEOS2 = false
        var sampleTime = 0L
        while (!sawEOS2) {
            audioBufferInfo.offset = offset
            audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, offset)
            if (audioBufferInfo.presentationTimeUs >= duration) {
                sawEOS2 = true
                audioBufferInfo.size = 0
            } else {
                if (audioBufferInfo.size < 0) {
                    sampleTime = audioBufferInfo.presentationTimeUs
                    audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                    continue
                }
            }
            audioBufferInfo.presentationTimeUs = audioExtractor.sampleTime + sampleTime
            audioBufferInfo.flags = audioExtractor.sampleFlags
            muxer.writeSampleData(audioTrack, audioBuf, audioBufferInfo)
            audioExtractor.advance()
        }
        muxer.stop()
        muxer.release()
        videoExtractor.release()
        audioExtractor.release()
        afdd.close()
    } catch (e: Exception) {
        LogUtils.e(TAG, "Mixer Error:" + e.message)
    }
}
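getVideoDuration isn't shown above; a minimal version could look like the sketch below (MediaMetadataRetriever reports the duration in milliseconds, so it is converted to microseconds to match presentationTimeUs):

// Sketch of the helper assumed above: returns the video duration in microseconds
// so it can be compared directly with presentationTimeUs.
private fun getVideoDuration(path: String): Long {
    val retriever = MediaMetadataRetriever()
    return try {
        retriever.setDataSource(path)
        val durationMs = retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION)
            ?.toLongOrNull() ?: 0L
        durationMs * 1000L // milliseconds to microseconds
    } finally {
        retriever.release()
    }
}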