I need to convert audio containing speech (e.g., MP3 or another audio format) into a text transcript using a speech-to-text (voice recognition) engine with high accuracy. Many increasingly accurate options exist, but they are designed for speech spoken into the device microphone (e.g., Google Translate and its corresponding web API, or the Dragon app for iOS). I need a way to feed an audio file directly into the speech recognition engine or API. I don't want to play the audio through a speaker and capture it with a microphone: that takes considerable time for long audio files and degrades the audio quality, and with it the transcription quality. Does a web service, API, or code for this exist? Is there some kind of wrapper around one of the existing services that presume the microphone will be the source?
Thanks
Step 1: Open Google Docs and select 'Tools', then 'Voice typing'. Step 2: Select your language, then click the microphone icon. Step 3: Play the audio you want to transcribe, and Google Docs should start transcribing automatically.
Google today open-sourced the speech engine that powers its Android speech recognition transcription tool Live Transcribe. The company hopes doing so will let any developer deliver captions for long-form conversations. The source code is available now on GitHub.
Built on Google's speech-recognition engines, Speechnotes is a simple, clean, online dictation tool that helps users transcribe their speech into text with over 90% accuracy. And since you don't have to download, install, or register for Speechnotes, it's one of the most accessible dictation tools out there.
There is now a relatively new service that offers automatic speech-to-text transcription, along with a great web interface for human editing of the results. It's:
https://trint.com/
We've used it and have been pleased with the results. The transcription is certainly not perfect, but it's a great start, and the interface makes human editing easy.
There is also now a new API and service available from IBM Bluemix/Watson. You can try the free demo here:
https://speech-to-text-demo.mybluemix.net/
This service does a pretty decent job of converting audio (sourced from the mic or from an audio file) into text. Currently, at least in the demo, it doesn't appear to accept MP3, but it does accept WAV and other formats. The service has a full API and is primarily designed to be built into applications.
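For anyone who wants to skip the microphone entirely, here is a rough sketch in Python of what posting an audio file straight to an HTTP speech-to-text service could look like. It uses only the standard library. The service URL and API key are hypothetical placeholders, and the `/v1/recognize` path, the Basic auth scheme with the user name `apikey`, and the `results` / `alternatives` / `transcript` response layout are assumptions modelled on how this kind of REST interface is typically documented, so check them against the docs for your own service instance before relying on any of it.

```python
import base64
import json
import urllib.request

# Hypothetical placeholders: use the endpoint and key issued for your own instance.
SERVICE_URL = "https://example.invalid/speech-to-text/api"
API_KEY = "your-api-key"


def build_recognize_request(audio_bytes, service_url=SERVICE_URL, api_key=API_KEY,
                            content_type="audio/wav"):
    """Build a POST request whose body is the raw audio file."""
    # Assumption: HTTP Basic auth with the literal user name "apikey".
    token = base64.b64encode(f"apikey:{api_key}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=service_url.rstrip("/") + "/v1/recognize",  # assumed path
        data=audio_bytes,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": "Basic " + token,
        },
    )


def extract_transcript(response_json):
    """Join the top alternative of each result segment into one string."""
    pieces = []
    # Assumed response shape: {"results": [{"alternatives": [{"transcript": ...}]}]}
    for result in response_json.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            pieces.append(alternatives[0].get("transcript", "").strip())
    return " ".join(p for p in pieces if p)


def transcribe_file(path):
    """Read an audio file from disk, send it to the service, return the text."""
    with open(path, "rb") as f:
        request = build_recognize_request(f.read())
    with urllib.request.urlopen(request) as response:
        return extract_transcript(json.load(response))
```

Calling `transcribe_file("interview.wav")` (the file name is just an example) would upload the file and return the joined transcript. Since MP3 may not be accepted, one option is to convert to WAV first, for example with `ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav`.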