Google speech API [closed]

I'm working on a project to build a Siri-like application for desktop computers. Is the Google Speech API reliable and accurate for speech recognition? Which speech API would you suggest as the most accurate for speech recognition? Preferably a free one. Thank you.

Dheby Chan asked Oct 04 '12 06:10



1 Answer

While the Google speech API is free, it is not an official public API. Some people have reverse engineered it, as is discussed in this blog. If you are planning on accessing the API directly for a commercial product, I would not recommend it, because Google can drop it or change it without warning, breaking your product. This recently happened to developers who used the Google Weather API. If, on the other hand, you are accessing it through the Chrome browser using x-webkit-speech, you are probably safe, since that is supported by Google.

Google's speech recognition is right up there with a lot of the more popular commercial solutions. They have a lot of experience with it from other projects like Google Voice and the now defunct Google 411, and they have some of the top speech scientists working for them.

The only other free alternative I can think of is Sphinx, an open source project out of Carnegie Mellon University. It has a steep learning curve, and if you want it set up as a service you will have to develop that yourself.

Nuance is the other big player in the speech recognition market (I believe that is what Siri uses), and they do have solutions that offer speech recognition as a service. But they are pricey.

Update on Answer From Comments on Language Support

Windows Speech Recognition supports other languages, as do most speech recognition systems. The caveat is that you have to tell the system which language to use, and it has to support the language in question. Each vendor has a list of languages it supports, and they are specific to a region. For example, a vendor may support Mexican Spanish, American Spanish and Spain Spanish, which are all slightly different dialects. But the speech recognition engine can only support one language/dialect at a time per user. A user cannot speak multiple languages to a speech recognition system without first requesting it to change to that language.

Updated 3/17/2014

The x-webkit-speech input field is being deprecated due to lack of support in other browsers. It will be replaced by the Web Speech API, which is a JavaScript API. You can find an example of how to use it here.
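
For a rough idea of what that looks like, here is a minimal sketch in TypeScript. The helper names (findRecognizer, listenOnce) and the small interfaces are my own, written only to type the handful of Web Speech API members used here (lang, interimResults, onresult, onerror, start); they are not part of the API. It assumes a browser that exposes SpeechRecognition or the prefixed webkitSpeechRecognition, and it sets a single language tag per session, in line with the one-language-at-a-time point above.

```typescript
// Minimal structural types for the parts of the Web Speech API used below.
// These interfaces are illustrative; the browser provides the real objects.
interface RecognitionAlternative {
  transcript: string;
  confidence: number;
}
interface RecognitionResultEvent {
  results: ArrayLike<ArrayLike<RecognitionAlternative>>;
}
interface Recognizer {
  lang: string;
  interimResults: boolean;
  onresult: ((event: RecognitionResultEvent) => void) | null;
  onerror: ((event: { error: string }) => void) | null;
  start(): void;
}
type RecognizerCtor = new () => Recognizer;

// Look up the recognizer constructor; Chrome exposes it with a webkit prefix.
function findRecognizer(scope: any): RecognizerCtor | undefined {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
}

// Start one recognition session in a single language (a tag such as "es-MX")
// and pass the top transcript of the first result to the callback.
function listenOnce(
  Ctor: RecognizerCtor,
  lang: string,
  onText: (text: string) => void
): Recognizer {
  const recognizer = new Ctor();
  recognizer.lang = lang;
  recognizer.interimResults = false; // only report final results
  recognizer.onresult = (event) => {
    const best = event.results[0]?.[0];
    if (best) {
      onText(best.transcript);
    }
  };
  recognizer.onerror = (event) => {
    console.error("speech recognition error:", event.error);
  };
  recognizer.start();
  return recognizer;
}

// In a supporting browser this starts listening; elsewhere it does nothing.
const BrowserRecognizer = findRecognizer(globalThis);
if (BrowserRecognizer) {
  listenOnce(BrowserRecognizer, "en-US", (text) => console.log("heard:", text));
}
```

To switch languages you would start a new session with a different tag, for example listenOnce(BrowserRecognizer, "es-MX", ...). The browser will normally ask the user for microphone permission before any results arrive.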

Kevin Junghans answered Oct 08 '22 06:10