WebRTC video/audio streams out of sync (MediaStream -> MediaRecorder -> MediaSource -> Video Element)

I am taking a MediaStream and merging two separate tracks (video and audio) using a canvas and the Web Audio API. The MediaStream itself does not seem to fall out of sync, but after reading it into a MediaRecorder and buffering it into a video element through a MediaSource, the audio always seems to play much earlier than the video. Here's the code that seems to have the issue:

let stream = new MediaStream();

// Get the mixed sources drawn to the canvas
this.canvas.captureStream().getVideoTracks().forEach(track => {
  stream.addTrack(track);
});

// Add mixed audio tracks to the stream
// https://stackoverflow.com/questions/42138545/webrtc-mix-local-and-remote-audio-steams-and-record
this.audioMixer.dest.stream.getAudioTracks().forEach(track => {
  stream.addTrack(track);
});

let mediaRecorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=opus,vp8' });

let mediaSource = new MediaSource();
let video = document.createElement('video');
video.src = URL.createObjectURL(mediaSource);
document.body.appendChild(video);
video.controls = true;
video.autoplay = true;

// Source open
mediaSource.onsourceopen = () => {
  let sourceBuffer = mediaSource.addSourceBuffer(mediaRecorder.mimeType);

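  // Append each recorded chunk to the SourceBuffer as it arrives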
  mediaRecorder.ondataavailable = (event) => {

    if (event.data.size > 0) {
      const reader = new FileReader();
      reader.readAsArrayBuffer(event.data);
      reader.onloadend = () => {
        sourceBuffer.appendBuffer(reader.result);
        console.log(mediaSource.sourceBuffers);
        console.log(event.data);
      }
    }
  }
  mediaRecorder.start(1000);
}

AudioMixer.js

export default class AudioMixer {

  constructor() {
    // Initialize an audio context
    this.audioContext = new AudioContext();

    // Destination outputs one track of mixed audio
    this.dest = this.audioContext.createMediaStreamDestination();

    // Array of current streams in mixer
    this.sources = [];
  }

  // Add an audio stream to the mixer
  addStream(id, stream) {
    // Get the audio tracks from the stream and add them to the mixer
    let sources = stream.getAudioTracks().map(track => this.audioContext.createMediaStreamSource(new MediaStream([track])));
    sources.forEach(source => {

      // Add it to the current sources being mixed
      this.sources.push(source);
      source.connect(this.dest);

      // Connect to analyser to update volume slider
      let analyser = this.audioContext.createAnalyser();
      source.connect(analyser);
      ...
    });
  }

  // Remove all current sources from the mixer
  flushAll() {
    this.sources.forEach(source => {
      source.disconnect(this.dest);
    });

    this.sources = [];
  }

  // Clean up the audio context for the mixer
  cleanup() {
    this.audioContext.close();
  }
}

I assume it has to do with how the data is pushed into the MediaSource buffer, but I'm not sure. What am I doing that de-syncs the stream?

Asked Sep 02 '18 by Jacob Greenway

2 Answers

A late reply to an old post, but it might help someone ...

I had exactly the same problem: I have a video stream that should be supplemented by an audio stream, and in the audio stream short sounds (AudioBuffer) are played from time to time. The whole thing is recorded via MediaRecorder. Everything works fine on desktop Chrome, but on Chrome for Android all sounds were played back in quick succession: the "when" parameter passed when starting the buffer source (AudioBufferSourceNode.start(when)) was ignored on Android. (audioContext.currentTime continued to increase over time, so that was not the issue.)
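
For reference, a minimal sketch of the scheduling pattern described above; audioContext, audioMixer, and someAudioBuffer are assumed names, not taken from the question:

const src = audioContext.createBufferSource();
src.buffer = someAudioBuffer;               // a short, previously decoded sound
src.connect(audioMixer.dest);               // into the stream being recorded
src.start(audioContext.currentTime + 0.5);  // the "when" parameter that Android ignored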

My solution is similar to Jacob's comment from Sep 2 '18 at 7:41: I created and connected a sine-wave oscillator at an inaudible 48,000 Hz that plays continuously in the audio stream for the duration of the recording. Apparently this keeps the recording's timeline progressing properly.
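
A minimal sketch of that workaround, assuming the AudioMixer from the question; the oscillator wiring is illustrative, not the answerer's exact code:

// Keep an inaudible tone running in the mixed audio for the entire recording,
// so the recorded audio track always carries samples and its timeline keeps advancing.
const osc = audioMixer.audioContext.createOscillator();
osc.frequency.value = 48000;   // per the answer; clamped to the context's Nyquist limit, still inaudible
osc.connect(audioMixer.dest);  // feed it into the stream handed to MediaRecorder
osc.start();
// ... when recording stops:
// osc.stop();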

Answered by Martin Luckow


An RTP endpoint that is emitting multiple related RTP streams that require synchronization at the other endpoint(s) MUST use the same RTCP CNAME for all streams that are to be synchronized. This requires a short-term persistent RTCP CNAME that is common across several RTP streams, and potentially across several related RTP sessions. A common example of such use occurs when lip-syncing audio and video streams in a multimedia session, where a single participant has to use the same RTCP CNAME for its audio RTP session and for its video RTP session. Another example might be to synchronize the layers of a layered audio codec, where the same RTCP CNAME has to be used for each layer.

https://datatracker.ietf.org/doc/html/rfc6222#page-2

Answered by Usama