Speech Recognition API

I need to automatically transcribe some short MP3s as part of a proof of concept I am working on. I am currently looking into cloud solutions or web API services to send the MP3 as a simple HTTP request and receive a transcription back.

The only free/open source solution I have found is here, but the demos don't seem to work (at least not on the files I need to transcribe). I have also found some enterprise solutions for call centers, but so far nothing I can simply integrate into a project.

Are there any web-based speech recognition services available? One that can filter out minor background noise would be a plus.

MrGlass asked Nov 10 '10 06:11


3 Answers

Here is an unofficial method for accessing Google's ASR capability. I tested it yesterday and it still works: you can get JSON-style ASR output, with words and their associated confidence scores, from FLAC audio sampled at 16 kHz.
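Since the question starts from MP3s and this method takes 16 kHz FLAC, a conversion step is needed first. The following is only a minimal sketch of that step, assuming `ffmpeg` is installed and on the PATH. The function names `build_flac_command` and `convert_to_flac` are hypothetical, and the downmix to mono is my own assumption rather than a documented requirement of the service.

```python
import subprocess
from pathlib import Path


def build_flac_command(src, dst, sample_rate=16000):
    """Return an ffmpeg argument list that converts src to FLAC at sample_rate Hz."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", str(src),           # input file, e.g. an MP3
        "-ac", "1",               # downmix to one channel (assumption, see above)
        "-ar", str(sample_rate),  # resample to the target rate
        str(dst),                 # ffmpeg infers FLAC from the .flac extension
    ]


def convert_to_flac(src, sample_rate=16000):
    """Convert src to a sibling .flac file and return its path (needs ffmpeg)."""
    dst = Path(src).with_suffix(".flac")
    subprocess.run(build_flac_command(src, dst, sample_rate), check=True)
    return dst
```

Calling `convert_to_flac("clip.mp3")` should leave a `clip.flac` next to the input, which can then be sent to the recognizer.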

Leo5188 answered Nov 08 '22 06:11


You can also try the speech recognition engine in Windows 7 to produce subtitles. Here is a tool for that.

VahidN answered Nov 08 '22 08:11


This may be a good match. Also, their TechCrunch profile (see this) lists these competitors: SimulScribe, SpinVox, Vlingo, Nuance, Microsoft, and Google. Some of these links may be helpful.

Vlingo, Bing, and Google have recognizers in the cloud, but I don't think they expose them as public APIs. I believe they are accessible only from their authorized clients.

For a proof of concept (and low volume), have you considered just using the desktop speech engines that come with Windows 7? The question "What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?" may be helpful. The MS desktop recognizers ship with a dictation grammar, and it sounds like that is what you will need.

Michael Levy answered Nov 08 '22 06:11