On the documentation page https://cloud.google.com/speech/ there is a demo that listens to speech in the browser and uses the API in the background. Is the source for this demo available?
If not: the Speech API accepts FLAC files. Is there an open source project that can record FLAC files in the browser in a form compatible with this API? There are many GitHub projects out there, but I am wondering whether there is an official one.
You can simply speak into a microphone and the Google API will transcribe the speech into written text. The API gives excellent results for English. A speech recognition API offloads the recognition logic: you send a web request containing the audio to the API, and it returns the text that was recognized.
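To make the request/response idea concrete, here is a minimal sketch of the JSON body for the v1 REST method `speech:recognize` and of pulling the transcript out of the reply. The helper names (`toBase64`, `buildRecognizeRequest`, `extractTranscript`) are my own and not part of any Google library, and the `en-US` default is just an assumption for illustration.

```typescript
// Sketch only: the shapes follow the public v1 REST method speech:recognize.
// All helper names here are hypothetical, not part of a Google client library.

interface RecognizeRequest {
  config: {
    encoding: "FLAC" | "LINEAR16";
    sampleRateHertz: number;
    languageCode: string;
  };
  audio: { content: string }; // base64-encoded audio bytes
}

interface RecognizeResponse {
  results?: { alternatives?: { transcript?: string; confidence?: number }[] }[];
}

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Dependency-free base64 encoder, so the sketch runs in a browser or in Node.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += B64.charAt(b0 >> 2);
    out += B64.charAt(((b0 & 3) << 4) | (b1 >> 4));
    // Pad with "=" when the final group has fewer than three bytes.
    out += i + 1 < bytes.length ? B64.charAt(((b1 & 15) << 2) | (b2 >> 6)) : "=";
    out += i + 2 < bytes.length ? B64.charAt(b2 & 63) : "=";
  }
  return out;
}

// Build the JSON body for a FLAC recording.
function buildRecognizeRequest(
  audio: Uint8Array,
  sampleRateHertz: number,
  languageCode: string = "en-US" // assumed default, change as needed
): RecognizeRequest {
  return {
    config: { encoding: "FLAC", sampleRateHertz, languageCode },
    audio: { content: toBase64(audio) },
  };
}

// Join the top alternative of each result into a single string.
function extractTranscript(response: RecognizeResponse): string {
  return (response.results ?? [])
    .map((r) => r.alternatives?.[0]?.transcript ?? "")
    .join(" ")
    .trim();
}
```

The body would then be sent as a POST with `Content-Type: application/json` to `https://speech.googleapis.com/v1/speech:recognize`, authenticated with your own credentials. Keeping the request building separate from the network call makes it easy to check the payload before spending quota.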
Text-to-Speech is priced based on the number of characters sent to the service to be synthesized into audio each month. You must enable billing to use Text-to-Speech, and will be automatically charged if your usage exceeds the number of free characters allowed per month.
Google today open-sourced the speech engine that powers its Android speech recognition transcription tool Live Transcribe. The company hopes doing so will let any developer deliver captions for long-form conversations. The source code is available now on GitHub.
In case it helps someone, these links are a good starting point for recording audio from the browser and sending it to the API:
https://github.com/GoogleCloudPlatform/nodejs-docs-samples/blob/master/speech/recognize.js
https://developers.google.com/web/fundamentals/native-hardware/recording-audio/#acquire_access_to_the_microphone
and https://github.com/mattdiamond/Recorderjs
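For what happens between the microphone and the upload, here is a rough sketch. In the browser, `navigator.mediaDevices.getUserMedia({ audio: true })` yields a stream whose samples, once routed through Web Audio, arrive as floats in [-1, 1]. Recorderjs turns those into 16-bit PCM inside a WAV file. The Speech API also lists `LINEAR16` as an encoding, so 16-bit PCM is one alternative to FLAC. The function names below are mine, and the microphone wiring is left out so that the conversion step stands on its own.

```typescript
// Sketch only: helper names are hypothetical. The input is assumed to be
// chunks of Web Audio samples (floats in [-1, 1]) collected while recording.

// Concatenate the recorded chunks into one contiguous sample buffer.
function mergeChunks(chunks: Float32Array[]): Float32Array {
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const merged = new Float32Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    merged.set(chunk, offset);
    offset += chunk.length;
  }
  return merged;
}

// Convert float samples to 16-bit little-endian PCM bytes (LINEAR16).
function floatTo16BitPCM(samples: Float32Array): Uint8Array {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return new Uint8Array(buffer);
}
```

If you send these bytes as `LINEAR16`, the sample rate in the request has to match the rate the audio was actually captured at (the `sampleRate` of your `AudioContext`).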
Edit: The solution was open-sourced as its own project: https://github.com/gridcellcoder/cloud-speech-and-vision-demos