I have looked into the Google Cloud Speech API and got streaming from my microphone working on a Node server.
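For reference, what I have running on the Node side is roughly the official streaming sample: record from the local microphone and pipe the raw audio into a streaming recognize request. This is a simplified sketch assuming the @google-cloud/speech client and node-record-lpcm16 (the recorder API differs slightly between versions):

```js
// Local mic -> Google Cloud Speech streaming recognition (simplified).
const speech = require('@google-cloud/speech');
const recorder = require('node-record-lpcm16');

const client = new speech.SpeechClient();

const recognizeStream = client
  .streamingRecognize({
    config: {
      encoding: 'LINEAR16',
      sampleRateHertz: 16000,
      languageCode: 'en-US',
    },
    interimResults: false,
  })
  .on('error', console.error)
  .on('data', data => {
    const result = data.results[0];
    if (result && result.alternatives[0]) {
      console.log(`Transcript: ${result.alternatives[0].transcript}`);
    }
  });

// Pipe raw LINEAR16 audio from the server's microphone into the API.
recorder
  .record({ sampleRateHertz: 16000, threshold: 0 })
  .stream()
  .on('error', console.error)
  .pipe(recognizeStream);
```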
I was then wondering what the best practice would be for streaming my microphone from a web frontend. Is it sending an audio stream from getUserMedia to the Node server and piping it to the API with the Node API client? Or is it simply saving the voice input to a file that I then transmit to the API?
The intent is to "transcribe" instructions (one or two sentences long) and send the result to another API.
I'm aware this question is over a year old and the OP has probably either found an answer or given up, but I spent long enough trying in vain to google this before I figured it out that I wanted to help anyone following in my footsteps: I wrote up a tutorial for basically this exact situation here.
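In broad strokes, in case the link ever goes stale: capture audio in the browser with getUserMedia, convert it to 16 kHz LINEAR16 chunks (e.g. in an AudioWorklet or ScriptProcessor), push each chunk over a socket to the Node server, and write those chunks into a streaming recognize request. Below is a minimal sketch of the server half only, assuming the ws package and a browser that already sends raw LINEAR16 chunks as binary messages; the port and message shape are illustrative, not taken from the tutorial:

```js
// Node server: receive raw audio from the browser over a WebSocket and
// pipe it into Google Cloud Speech streaming recognition.
const WebSocket = require('ws');
const speech = require('@google-cloud/speech');

const client = new speech.SpeechClient();
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', ws => {
  // One streaming recognition session per connected browser.
  const recognizeStream = client
    .streamingRecognize({
      config: {
        encoding: 'LINEAR16',
        sampleRateHertz: 16000,
        languageCode: 'en-US',
      },
      interimResults: true,
    })
    .on('error', console.error)
    .on('data', data => {
      const result = data.results[0];
      if (result && result.alternatives[0]) {
        // Relay transcripts back to the browser (or forward them to the other API).
        ws.send(JSON.stringify({
          transcript: result.alternatives[0].transcript,
          isFinal: result.isFinal,
        }));
      }
    });

  // Each binary message from the browser is one chunk of LINEAR16 PCM audio.
  ws.on('message', chunk => recognizeStream.write(chunk));
  ws.on('close', () => recognizeStream.end());
});
```

Since the instructions are only a sentence or two long, one streaming session per utterance stays well inside the streaming time limit, and you can forward the transcript to the other API once a result arrives with isFinal set.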