
What is the current best speech recognition API for iOS to match a few keywords? [closed]

I am looking for an API for iOS (ideally free) that can do some speech recognition. I have seen a few posts on this, "iPhone speech recognition API?" and "free speech recognition engines for iOS?", and after a bit of research I have gathered the SDKs that look quite interesting:

  • http://dragonmobile.nuancemobiledeveloper.com/public/index.php?task=home
  • http://www.politepix.com/openears
  • http://www.creaceed.com/ceedvocalsdk/ (not free :-\ )
  • http://www.ispeech.org/

Do any of these really stand out from the crowd, and are they reasonably recent? How do they actually differ from each other?

tiguero asked Feb 08 '12 22:02



2 Answers

If you only want to detect a few keywords, you should not be looking for a speech recognition API or service. This task is called keyword spotting, and it uses different algorithms than speech recognition. Speech recognition tries to find all the words that have been said, and because of that it consumes far more resources than keyword spotting. A keyword spotter only tries to find a few selected keywords or keyphrases. It is much simpler and much less resource-intensive.
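To make the distinction concrete, here is a toy sketch of one common formulation of keyword spotting: each keyword model is scored against a generic background model, and a detection fires when the log-likelihood ratio exceeds a threshold. This is only an illustration of the idea, not how Pocketsphinx or OpenEars is implemented. The function name `spot_keywords`, the `Detection` type, the per-frame scores, and the threshold are all made up for the example; a real spotter would compute the scores from audio with acoustic models.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    keyword: str
    frame: int
    score: float  # log-likelihood ratio: keyword model vs. background model


def spot_keywords(keyword_scores, background_scores, threshold):
    """Return a Detection for every frame where a keyword beats the background.

    keyword_scores: dict mapping each keyword to its per-frame log-likelihoods
                    (hypothetical values; a real system derives them from audio).
    background_scores: per-frame log-likelihoods of a generic "anything else" model.
    threshold: minimum log-likelihood ratio needed to report a detection.
    """
    detections = []
    for keyword, scores in keyword_scores.items():
        for frame, (kw, bg) in enumerate(zip(scores, background_scores)):
            ratio = kw - bg
            if ratio > threshold:
                detections.append(Detection(keyword, frame, ratio))
    return detections


# Made-up scores for four audio frames and two keywords.
background = [-5.0, -5.0, -5.0, -5.0]
scores = {
    "start": [-9.0, -2.0, -8.0, -9.0],
    "stop": [-9.0, -9.0, -8.5, -1.5],
}

for hit in spot_keywords(scores, background, threshold=2.0):
    print(hit.keyword, hit.frame, hit.score)
```

Only the selected keywords are ever scored, which is why this is cheaper than transcribing every word. The threshold trades false alarms against missed detections.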

The only way to achieve this functionality is to use an open source package such as OpenEars, which is powered by Pocketsphinx:

http://www.politepix.com/openears

OpenEars has a Rejecto plugin that implements something similar.

Pocketsphinx itself has recently implemented effective open source keyword spotting too, but it hasn't made it into OpenEars yet. It is only available through the Pocketsphinx API: you need to create a kws search and set the target word to look for. I hope this functionality will reach OpenEars soon.

Nikolay Shmyrev answered Sep 19 '22 15:09


Nuance gives developers free access (but not for high volume). See http://www.masshightech.com/stories/2011/09/26/daily13-Nuance-tweaks-mobile-dev-program-with-free-access-to-Dragon.html or http://dragonmobile.nuancemobiledeveloper.com/public/index.php?task=home

Nuance services are typically offered commercially and require up-front fees and transaction fees. The interesting news above is that they now make low-volume use of their services available to developers for free. So, for development, testing, and demonstration, you can probably use the free Nuance services. However, unlike the Google services that come free on Android, if your app has thousands of users you will likely have to pay for Nuance services.

Michael Levy answered Sep 19 '22 15:09
