I am taking a MediaStream and merging two separate tracks (video and audio) using a canvas and the WebAudio API. The MediaStream itself does not seem to fall out of sync, but after reading it into a MediaRecorder and buffering it into a video element the audio will always seem to play much earlier than the video Here's the code that seems to have the issue:
let stream = new MediaStream();
// Get the mixed sources drawn to the canvas
this.canvas.captureStream().getVideoTracks().forEach(track => {
stream.addTrack(track);
});
// Add mixed audio tracks to the stream
// https://stackoverflow.com/questions/42138545/webrtc-mix-local-and-remote-audio-steams-and-record
this.audioMixer.dest.stream.getAudioTracks().forEach(track => {
stream.addTrack(track);
});
// stream = stream;
let mediaRecorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=opus,vp8' });
let mediaSource = new MediaSource();
let video = document.createElement('video');
video.src = URL.createObjectURL(mediaSource);
document.body.appendChild(video);
video.controls = true;
video.autoplay = true;
// Source open
mediaSource.onsourceopen = () => {
let sourceBuffer = mediaSource.addSourceBuffer(mediaRecorder.mimeType);
mediaRecorder.ondataavailable = (event) => {
if (event.data.size > 0) {
const reader = new FileReader();
reader.readAsArrayBuffer(event.data);
reader.onloadend = () => {
sourceBuffer.appendBuffer(reader.result);
console.log(mediaSource.sourceBuffers);
console.log(event.data);
}
}
}
mediaRecorder.start(1000);
}
AudioMixer.js
export default class AudioMixer {
constructor() {
// Initialize an audio context
this.audioContext = new AudioContext();
// Destination outputs one track of mixed audio
this.dest = this.audioContext.createMediaStreamDestination();
// Array of current streams in mixer
this.sources = [];
}
// Add an audio stream to the mixer
addStream(id, stream) {
// Get the audio tracks from the stream and add them to the mixer
let sources = stream.getAudioTracks().map(track => this.audioContext.createMediaStreamSource(new MediaStream([track])));
sources.forEach(source => {
// Add it to the current sources being mixed
this.sources.push(source);
source.connect(this.dest);
// Connect to analyser to update volume slider
let analyser = this.audioContext.createAnalyser();
source.connect(analyser);
...
});
}
// Remove all current sources from the mixer
flushAll() {
this.sources.forEach(source => {
source.disconnect(this.dest);
});
this.sources = [];
}
// Clean up the audio context for the mixer
cleanup() {
this.audioContext.close();
}
}
I assume it has to do with how the data is pushed into the MediaSource buffer but I'm not sure. What am I doing that de-syncs the stream?
A late reply to an old post, but it might help someone ...
I had exactly the same problem: I have a video stream, which should be supplemented by an audio stream. In the audio stream short sounds (AudioBuffer) are played from time to time. The whole thing is recorded via MediaRecorder. Everything works fine on Chrome. But on Chrome for Android, all sounds were played back in quick succession. The "when" parameter for "play()" was ignored on Android. (audiocontext.currentTime continued to increase over time ... - that was not the point).
My solution is similar to Jacob's comment Sep 2 '18 at 7:41: I created and connected a sine wave oscillator with inaudible 48,000 Hz, which played permanently in the audio stream during recording. Apparently this leads to the proper time progress.
An RTP endpoint that is emitting multiple related RTP streams that require synchronization at the other endpoint(s) MUST use the same RTCP CNAME for all streams that are to be synchronized. This requires a short-term persistent RTCP CNAME that is common across several RTP streams, and potentially across several related RTP sessions. A common example of such use occurs when lip-syncing audio and video streams in a multimedia session, where a single participant has to use the same RTCP CNAME for its audio RTP session and for its video RTP session. Another example might be to synchronize the layers of a layered audio codec, where the same RTCP CNAME has to be used for each layer.
https://datatracker.ietf.org/doc/html/rfc6222#page-2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With