Text-to-speech (voice generation) and speech-to-text (voice recognition) APIs?

1 Answer

I'll rehash and update an answer from "Speech recognition in C or Java or PHP?". This is by no means comprehensive, but it might be a start for you.

From watching these questions for a few months, I've seen most developer choices break down like this:

Windows folks - use the System.Speech features of .NET, or Microsoft.Speech, and install the free recognizers Microsoft provides. Windows 7 includes a full speech engine; others are downloadable for free. There is also a C++ API to the same engines, known as SAPI. See http://msdn.microsoft.com/en-us/magazine/cc163663.aspx or http://msdn.microsoft.com/en-us/library/ms723627(v=vs.85).aspx for details. For more background on the Microsoft engines for Windows, see "What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?"

Linux folks - Sphinx seems to have a good following. See http://cmusphinx.sourceforge.net/ and http://cmusphinx.sourceforge.net/wiki/

Commercial products - Nuance, Loquendo, AT&T, IBM, and others. Each provides its own SDKs and libraries for various languages.

Online services - Nuance, Yapme, ispeech.org, vlingo, and others. Nuance has improved its developer program and will now give you free access to its services for development. Yap (I believe) was recently purchased by Amazon, so we may see some changes there.

Of course, this list may also be helpful: http://en.wikipedia.org/wiki/List_of_speech_recognition_software

There is also a Java Speech API. See javax.speech.recognition in the Java Speech API guide: http://java.sun.com/products/java-media/speech/forDevelopers/jsapi-guide/Recognition.html (I believe you still have to find a speech engine that supports this API). I don't think Sphinx fully supports it; see http://cmusphinx.sourceforge.net/sphinx4/doc/Sphinx4-faq.html#support_jsapi

There are lots of other SO questions, such as "Need text to speech and speech recognition tools for Linux" and "pyspeech (python) - Transcribe mp3 files?", which talks about http://code.google.com/p/pyspeech/. You may also want to look at http://code.google.com/p/dragonfly/

148

answered Oct 21 '22 09:10

Michael Levy

Tags:

speech-recognition

speech-to-text

text-to-speech

speech-synthesis

Vladimir Keleshev
