
Server-side Voice Recognition [closed]

Anyone know of any good server side voice recognition engines that are already hosted? I.e. I want to be able to call a simple web API posting some sound data and get text back. Doesn't have to be free - but hopefully free to experiment with.

aloo asked Jun 24 '10 21:06





1 Answer

There are several IVR services which host an entire VOIP session (telephone call) as a complete application, rather than offering individual service transactions "à la carte". If you were to make your program look like a VOIP call, you might be able to get it done with some of these services.

Voxeo published a list of free (and low cost) IVR hosting providers aimed towards developers for limited use. Not surprisingly, all will require registration.

  • VoiceGenie Developer Workshop (absorbed into Genesys)
  • Loquendo C@fé status unknown
  • Nuance Café (Bevocal) now Nuance On-Demand
  • Plum Voice Hosting now Plum DEV
  • VOICE Testcenter of the VOICE Community

Another possibility would be to make direct inquiries with Vlingo, Twilio, or Tropo, as they might sell you exactly what you need.

UPDATE: July 25, 2012

AT&T has announced availability of a Speech API. You send it audio – it returns text in XML or JSON data formats. See also the developer site.
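For what it's worth, the "audio in, text out" pattern usually boils down to a single HTTP POST. Below is a minimal sketch of that shape only, not any particular provider's API: the endpoint URL, the `audio/wav` content type, and the `"text"` field in the JSON reply are all placeholders I made up for illustration. A real service will have its own URL, authentication, and response schema, so check its documentation.

```python
import json
import urllib.request

# Hypothetical endpoint: a placeholder, not any real provider's URL.
ENDPOINT = "https://speech.example.com/v1/recognize"


def build_request(audio: bytes, url: str = ENDPOINT,
                  content_type: str = "audio/wav") -> urllib.request.Request:
    """Wrap raw audio bytes in an HTTP POST request."""
    return urllib.request.Request(
        url,
        data=audio,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )


def parse_transcript(body: bytes) -> str:
    """Extract the transcript from a JSON reply; the "text" field is assumed."""
    return json.loads(body.decode("utf-8"))["text"]


def recognize(audio: bytes, url: str = ENDPOINT) -> str:
    """Send the audio and return the recognized text (performs network I/O)."""
    with urllib.request.urlopen(build_request(audio, url)) as response:
        return parse_transcript(response.read())
```

Building the request and parsing the reply are kept separate from the network call so each piece can be checked offline; most real providers will also need an API key or OAuth token added to the headers.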

UPDATE: August 27, 2012

Another possibility is the Dragon Mobile SDK from Nuance, which is aimed at individual developers looking for an API enabling consumer applications with speech and/or text-to-speech functionality.

UPDATE: September 21, 2012

There seem to be several new providers offering exactly what you are looking for: speech samples in, text out. The following are listed on Programmable Web:

  • iSpeech
  • SpeechAPI
  • OneTok
  • AISpeech API
  • NexiWave

Also note that Loquendo is now part of Nuance.

UPDATE: June 27, 2013

AT&T's Speech API has a few targeted SDKs (Android, iOS, PhoneGap, Titanium, Windows) - some of which are hosted on GitHub. There's even source for a Unity 3D demo.

UPDATE: January 23, 2014

OneTok has reformulated its offerings as an SDK for iOS and Android.

Apparently the Voice Genie product has been thoroughly digested by Genesys such that little trace of it can be found. Given Genesys' positioning towards large enterprises, it is difficult to know if they have any small-volume or commodity offerings.

Plumvoice seems to have expanded their offerings.

As with many before it, Vlingo is now part of Nuance.

(I've tried to update any broken links in the original answer.)

UPDATE: October 31, 2015

Keeping this answer up-to-date is a Sisyphean task.

Voxeo's list of free (and low cost) IVR hosting providers now redirects to the AT&T Speech API, with which, in full disclosure, I now have material involvement. That disqualifies me from linking to pretty much anything without impugning my credibility.

That said, there are many players in the speech/NLP market. Do diligence.

UPDATE: April 8, 2016

So now Google is totally upsetting the apple cart.

13 revs, 2 users 98% answered Nov 16 '22 08:11
