
Speech to Text (Voice Recognition) Directly from Audio / Transcription [closed]

I need to convert or transcribe audio containing speech (e.g. from an .MP3 or other audio file) into text transcripts, using a speech-to-text (voice recognition) algorithm with high accuracy. Many increasingly accurate options exist, but they are designed for speech spoken into the device microphone (e.g. Google Translate and its corresponding web API, or the Dragon app for iOS). I need a way to feed an audio file directly into the speech recognition engine/API. I don't want to play the audio through a speaker and capture it with a microphone: that takes considerable time for long audio files, and it degrades both the audio quality and the resulting transcription quality. Does a web service, API, or code for this exist? Is there some kind of wrapper around one of the existing services that presume the microphone will be the source?

Thanks

user2330237 asked May 25 '14 21:05


1 Answer

There is now a relatively new service that offers automatic speech-to-text transcription, along with a great web interface for human editing of the results. It's:

https://trint.com/

We've used it, and been pleased with the results. The transcription is certainly not perfect, but it's a great start, and it allows ready human editing.

There is also now a new API and service available from IBM Bluemix/Watson. You can try the free demo here:

https://speech-to-text-demo.mybluemix.net/

This service does a pretty decent job of converting audio (sourced from the mic or from an audio file) into text. Currently, at least in the demo, it doesn't appear to accept MP3, but it does accept WAV and other formats. The service has a full API and is designed primarily to be built into applications.
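To illustrate what "feeding an audio file directly" to such an API might look like, here is a rough Python sketch that posts the raw bytes of a WAV file over HTTP and pulls the text out of a JSON reply. Treat the details as assumptions rather than the service's documented interface: `SERVICE_URL` is a hypothetical placeholder, and the `/v1/recognize` path, the `apikey` basic-auth scheme, and the `results` / `alternatives` / `transcript` response shape reflect my understanding of the Watson Speech to Text HTTP interface and should be checked against the current documentation. The function names are made up for this example.

```python
import base64
import json
import urllib.request

# Hypothetical placeholder: the real URL comes from your own service credentials.
SERVICE_URL = "https://example.invalid/speech-to-text/api"


def build_recognize_request(audio_bytes, api_key, service_url=SERVICE_URL,
                            content_type="audio/wav"):
    """Build a POST request carrying the raw audio bytes as the body."""
    # Assumed auth scheme: HTTP basic auth with the literal user name "apikey".
    token = base64.b64encode(f"apikey:{api_key}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=service_url.rstrip("/") + "/v1/recognize",  # assumed endpoint path
        data=audio_bytes,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": "Basic " + token,
        },
    )


def extract_transcript(response):
    """Join the first alternative of each result into one transcript string."""
    pieces = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            pieces.append(alternatives[0].get("transcript", "").strip())
    return " ".join(piece for piece in pieces if piece)


def transcribe_file(path, api_key, service_url=SERVICE_URL):
    """Read an audio file from disk, send it, and return the recognized text."""
    with open(path, "rb") as audio_file:
        request = build_recognize_request(audio_file.read(), api_key, service_url)
    with urllib.request.urlopen(request) as reply:
        return extract_transcript(json.load(reply))
```

With real credentials, a call would look something like `transcribe_file("interview.wav", "YOUR_API_KEY", "YOUR_SERVICE_URL")`. Since MP3 did not seem to be accepted in the demo, converting to WAV first is probably the safer route. No microphone is involved at any point: the file's bytes go straight into the request body.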

user2330237 answered Oct 21 '22 10:10
