I am trying to combine two audio files while delaying the second one. Here's my command:
ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka -i RT103bfe5f4b129860f69cd8e820f3a10b.mka -filter_complex "[1:a]adelay=13500s:all=1[apad]; [0:a][apad]amix=inputs=2:weights=1|1[aout]" -map [aout] combined_audio.mka
Here is the output that I'm getting. The problem is that the second audio is delayed by 5 hours and 45 minutes rather than 3 hours and 45 minutes:
ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka -i RT103bfe5f4b129860f69cd8e820f3a10b.mka -filter_complex "[1:a]adelay=13500s:all=1[apad]; [0:a][apad]amix=inputs=2:weights=1|1[aout]" -map [aout] combined_audio.mka
ffmpeg version n5.0-4-g911d7f167c-20220311 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 11.2.0 (crosstool-NG 1.24.0.533_681aaef)
configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-avisynth --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libass --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librist --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libmfx --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20220311
libavutil 57. 17.100 / 57. 17.100
libavcodec 59. 18.100 / 59. 18.100
libavformat 59. 16.100 / 59. 16.100
libavdevice 59. 4.100 / 59. 4.100
libavfilter 8. 24.100 / 8. 24.100
libswscale 6. 4.100 / 6. 4.100
libswresample 4. 3.100 / 4. 3.100
libpostproc 56. 3.100 / 56. 3.100
Input #0, matroska,webm, from 'RTb295d0534191e1acb22a45bb971a12e6.mka':
Metadata:
encoder : GStreamer matroskamux version 1.16.2
creation_time : 2022-03-23T21:20:27.000000Z
Duration: 03:45:00.47, start: 0.291000, bitrate: 19 kb/s
Stream #0:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
Metadata:
title : Audio
Input #1, matroska,webm, from 'RT103bfe5f4b129860f69cd8e820f3a10b.mka':
Metadata:
encoder : GStreamer matroskamux version 1.16.2
creation_time : 2022-03-24T01:05:30.000000Z
Duration: 02:45:03.51, start: 13502.587000, bitrate: 5 kb/s
Stream #1:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
Metadata:
title : Audio
Stream mapping:
Stream #0:0 (opus) -> amix
Stream #1:0 (opus) -> adelay:default
amix:default -> Stream #0:0 (libvorbis)
Press [q] to stop, [?] for help
Output #0, matroska, to 'combined_audio.mka':
Metadata:
encoder : Lavf59.16.100
Stream #0:0: Audio: vorbis (oV[0][0] / 0x566F), 48000 Hz, stereo, fltp
Metadata:
encoder : Lavc59.18.100 libvorbis
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time231x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time184x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time189x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time223x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time275x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time245x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time213x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time209x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time208x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time204x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time199x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time193x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time185x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time181x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time178x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time177x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time176x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time169x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time167x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time163x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time146x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time139x
size= 75141kB time=06:07:52.57 bitrate= 27.9kbits/s speed= 130x
video:0kB audio:70470kB subtitle:0kB other streams:0kB global headers:4kB muxing overhead: 6.628071%
The audio files being mixed together - https://www.easyupload.io/m/durisk
How can I resolve this issue?
The fundamental issue in these audio files appears to be frequently dropped frames (each containing 960 audio samples). In one instance there is a gap of 8117 seconds between 2 successive frames in the first file. Because the MKA files were written without filling in these dropped frames, they are effectively variable-sampling-rate streams labeled as constant-sampling-rate. This discrepancy makes your audio appear shorter than what was recorded, which explains why your output is much longer than expected, and it has been wreaking havoc on your attempts to work with these files.
I do not currently know whether FFmpeg offers a mechanism to fix or estimate the dropped frames in these files, but you can brute-force it by ignoring the dropped frames:
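To make the dropped-frame idea concrete, here is a minimal Python sketch that scans a list of frame timestamps and reports where successive frames are further apart than one 960-sample frame (20 ms at 48000 Hz). The function name `find_gaps`, the tolerance, and the example timestamps are assumptions made up for illustration; real timestamps would have to be extracted from the files first (for example with ffprobe), which is not shown here.

```python
from fractions import Fraction

SAMPLE_RATE = 48000        # from the ffmpeg log: "48000 Hz"
SAMPLES_PER_FRAME = 960    # frame size mentioned above

def find_gaps(pts_seconds, tolerance=Fraction(1, 1000)):
    """Return (index, missing_seconds) for each frame that starts later than
    one frame duration after the previous frame (beyond `tolerance`)."""
    frame_duration = Fraction(SAMPLES_PER_FRAME, SAMPLE_RATE)  # 1/50 s = 20 ms
    gaps = []
    for i in range(1, len(pts_seconds)):
        spacing = Fraction(pts_seconds[i]) - Fraction(pts_seconds[i - 1])
        missing = spacing - frame_duration
        if missing > tolerance:
            gaps.append((i, float(missing)))
    return gaps

# Hypothetical timestamps: three contiguous frames, then a jump of 1 second.
example_pts = [Fraction(0), Fraction(1, 50), Fraction(2, 50), Fraction(52, 50)]
print(find_gaps(example_pts))  # [(3, 0.98)]
```

Each reported gap is audio time that exists in the timestamps but has no samples behind it, which is the mismatch described above.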
amix
ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka \
-i RT103bfe5f4b129860f69cd8e820f3a10b.mka \
-filter_complex "[1:a]asetpts=NB_CONSUMED_SAMPLES/SR/TB,adelay=13500s:all=1[apad]; \
[0:a]asetpts=NB_CONSUMED_SAMPLES/SR/TB,[apad]amix=inputs=2:weights=1|1[aout]" \
-map [aout] combined_audio.mka
concat
It appears your filtergraph really just concatenates 2 streams by delaying the second by the duration of the first. You can do the following instead:
ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka \
-i RT103bfe5f4b129860f69cd8e820f3a10b.mka \
-filter_complex "[1:a]asetpts=NB_CONSUMED_SAMPLES/SR/TB,[0a]concat=2:0:1[aout]; \
[0:a]asetpts=NB_CONSUMED_SAMPLES/SR/TB[0a]" \
-map [aout] combined_audio.mka
explanation
asetpts filters are used to completely ignore what the files say about the time of the frames being fed to the filtergraph and recompute the new PTS of each frame by using the following variables and formula:
NB_CONSUMED_SAMPLES: number of samples processed
SR: sampling rate (samples/second)
TB: timebase (second)
NB_CONSUMED_SAMPLES/SR/TB: new PTS (starting TB block index)
video streams
If your video files have the same issue, you can likewise use the setpts filter:
setpts=N/FR/TB
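As a rough illustration of the two PTS formulas above (NB_CONSUMED_SAMPLES/SR/TB for audio and N/FR/TB for video), the following Python sketch evaluates them with exact fractions. The helper names `audio_pts` and `video_pts` and the timebase values are assumptions chosen for the example; they are not FFmpeg APIs, and the actual timebase depends on the stream.

```python
from fractions import Fraction

def audio_pts(nb_consumed_samples, sample_rate, timebase):
    """NB_CONSUMED_SAMPLES/SR/TB: PTS of the next audio frame, in timebase units."""
    return Fraction(nb_consumed_samples, sample_rate) / timebase

def video_pts(frame_index, frame_rate, timebase):
    """N/FR/TB: PTS of video frame N, in timebase units."""
    return Fraction(frame_index) / frame_rate / timebase

# Assumed values: 960-sample frames at 48000 Hz with a timebase of 1/48000 s.
tb = Fraction(1, 48000)
for n in range(3):
    print(audio_pts(n * 960, 48000, tb))  # prints 0, then 960, then 1920
```

Because the new PTS depends only on how many samples (or frames) have been consumed so far, any gaps in the original timestamps disappear and the frames are placed back to back.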
filling missing frames
aresample can be used to fill the missing frames with zero-valued samples (silence) (see a reference post). You can give it a try to see what happens to each mka file:
ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka \
-af "aresample=async=1" \
patched_audio.mka
Note: This will make the stream longer, possibly with (low-pitched?) buzzing/beeping that makes it unlistenable. However, you may need to do this to sync the audio to videos. The resampler can also stretch the surrounding samples, so that might be a solution for you. See the documentation for the async, min_comp, min_hard_comp, comp_duration, and max_soft_comp options.
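Conceptually, filling missing frames with silence amounts to something like the Python sketch below. This is only an illustration of the idea, not how aresample is implemented, and it does not model the stretching/squeezing compensation mentioned above. The function name `fill_with_silence` and the tiny 2-sample frames are made up for the example, and frames are assumed to be sorted and non-overlapping.

```python
def fill_with_silence(frames):
    """frames: list of (start_sample, samples), sorted by start_sample.
    Returns one continuous sample list with zeros wherever frames are missing."""
    out = []
    for start, samples in frames:
        missing = start - len(out)
        if missing > 0:
            out.extend([0] * missing)  # pad the hole with zero-valued samples
        out.extend(samples)
    return out

# Hypothetical stream: the frame that should start at sample 4 was dropped.
frames = [(0, [1, 2]), (2, [3, 4]), (6, [5, 6])]
print(fill_with_silence(frames))  # [1, 2, 3, 4, 0, 0, 5, 6]
```

The output is longer than the sum of the input frames, which is why the patched file ends up longer than the original.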