
Google Meet: WebRTC peer-to-peer and Speech to Text

Tags:

webrtc

I was in a meeting on Google Meet and saw that you can turn on real-time subtitles. They've actually got a demo here of how real-time speech-to-text can be done, so that bit doesn't confuse me.

I had also been wanting to experiment with WebRTC (which I believe Google Meet uses) just to see its capabilities - e.g. the ability to share a screen without any additional screens.

However, I've always been under the impression that a WebRTC video/audio stream goes peer-to-peer between clients. My questions are therefore:

  • How then are Google able to send the audio stream off to a server for analysis?
  • Is it possible to send the audio stream to the client as well as to a server?
  • Would you have to create two copies of the same audio stream (I don't know if this is even possible), sending one over WebRTC to the other peer(s) and the other to a server for analysis?

How do they achieve this - and if they don't use WebRTC, is it possible to achieve this with WebRTC?

Luke Madhanga asked Mar 26 '20 21:03




1 Answer

Google Meet does use WebRTC. The "peer" in that case is a server, not a browser. Although this old article is six years old and some details have changed, much of it is still true. Because the audio stream reaches the server, Google can do the audio processing there.
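To make the "server as peer" idea concrete, here is a minimal sketch of the fan-out such a server could perform: every audio packet it receives is forwarded to the other participants and also handed to an analysis tap (for example, a speech-to-text pipeline). This is an illustration only, not Google's implementation. `ForwardingServer`, `AudioPacket`, and `PacketSink` are hypothetical names, and a real server would additionally handle ICE, DTLS/SRTP, and RTP.

```typescript
// Hypothetical types for illustration; they are not part of WebRTC or Google Meet.
type AudioPacket = { senderId: string; payload: Uint8Array };
type PacketSink = (packet: AudioPacket) => void;

class ForwardingServer {
  private participants = new Map<string, PacketSink>();
  private taps: PacketSink[] = [];

  // Register a participant and the callback used to deliver packets to them.
  join(id: string, sink: PacketSink): void {
    this.participants.set(id, sink);
  }

  leave(id: string): void {
    this.participants.delete(id);
  }

  // Register a server-side consumer, e.g. a speech-to-text pipeline.
  addTap(tap: PacketSink): void {
    this.taps.push(tap);
  }

  // Called for every audio packet a participant uploads to the server.
  receive(packet: AudioPacket): void {
    // Forward to everyone except the sender.
    this.participants.forEach((sink, id) => {
      if (id !== packet.senderId) sink(packet);
    });
    // Hand the same packet to each analysis tap.
    this.taps.forEach((tap) => tap(packet));
  }
}
```

With this shape the client uploads its audio only once; the duplication for other participants and for analysis happens on the server.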

This video describes the architecture required for speech-to-text (and also for translation followed by text-to-speech).
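On the question of whether a client could send the same audio to both a peer and a server: in a browser this should be possible by cloning the microphone track and adding one copy to each connection. The sketch below assumes that approach, which is not what Meet is described as doing above. `sendToEach`, `ClonableTrack`, and `TrackSender` are hypothetical names; the structural interfaces are meant to match `MediaStreamTrack.clone()` and `RTCPeerConnection.addTrack()` so the function can also be exercised outside a browser.

```typescript
// Minimal structural types so the sketch does not depend on browser globals.
// In a browser, MediaStreamTrack and RTCPeerConnection are expected to satisfy them.
interface ClonableTrack<T> {
  clone(): T;
}
interface TrackSender<T> {
  addTrack(track: T): unknown;
}

// Give each connection its own clone of the track and return the clones,
// so each copy can later be stopped or disabled independently.
function sendToEach<T extends ClonableTrack<T>>(
  track: T,
  connections: TrackSender<T>[],
): T[] {
  return connections.map((connection) => {
    const copy = track.clone();
    connection.addTrack(copy);
    return copy;
  });
}
```

In a browser, a call might look like `sendToEach(micTrack, [peerConnection, serverConnection])`, where `micTrack` comes from `getUserMedia` and both connections are `RTCPeerConnection` instances; those variable names are placeholders. The cost is that the client encodes and uploads the audio twice, which is one reason to route everything through a server instead.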

Philipp Hancke answered Sep 27 '22 22:09
