I think Google's speech to text facilities (Google Voice automatic transcription of voicemail, automatic captioning of videos on YouTube etc) are quite impressive.
I did look to see whether Google has made it available through an API and it seems they haven't (not that I blame them!). A cloud computing service providing speech to text functionality would be pretty cool though.
Is there some sort of "hack" that I can use to access the speech to text? My architecture basically comes down to this - a short 15-20 second wav/mp3/other clip as the input, and plaintext as the output.
Any ideas people?
There are a lot of speech to text APIs. Just because Google doesn't make theirs available, it doesn't mean you're out of luck.
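Here is a good open-source one, CMU Sphinx. Note that its recognizers are written in C (PocketSphinx) and Java (Sphinx4) rather than C#, so if you're on .NET you may want to search for others for your platform.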
http://cmusphinx.sourceforge.net/
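Whichever engine you end up with, the architecture in the question (short clip in, plaintext out) can stay a thin wrapper around it. Here is a rough sketch in Python, just to show the shape. The names `transcribe`, `clip_seconds`, `Recognizer` and `MAX_SECONDS` are made up for illustration, the duration check only handles WAV input, and the actual recognition is left to whatever function you pass in, so nothing here is tied to Sphinx or to Google.

```python
import wave
from typing import Callable

# Hypothetical type: any function that takes a WAV path and returns text.
# The real engine (Sphinx or something else) would be wrapped to fit this.
Recognizer = Callable[[str], str]

# Assumed upper bound, taken from the 15-20 second clips in the question.
MAX_SECONDS = 20.0


def clip_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def transcribe(path: str, recognize: Recognizer) -> str:
    """Reject clips that are too long, then hand the file to the engine."""
    seconds = clip_seconds(path)
    if seconds > MAX_SECONDS:
        raise ValueError(
            f"clip is {seconds:.1f}s, expected at most {MAX_SECONDS:.0f}s"
        )
    # Engines often pad their output with whitespace; return clean text.
    return recognize(path).strip()
```

You would call it as `transcribe("clip.wav", my_engine)`, where `my_engine` is your own adapter around the library you picked. Keeping the engine behind a plain function makes it easy to swap one backend for another later without touching the rest of the code.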
Check this out: http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/
I am currently trying to implement the API in PHP.
--Seth