I'm developing a website, and I would like to help blind people use it by voice, so I will need both text-to-speech and speech-to-text.
I already have some text-to-speech JavaScript libraries (like speak.js), but now I need a good speech-to-text one. There are some solutions for this purpose (like speechapi), but they use Java Applets or Flash, and I want to depend only on JavaScript, to avoid plugins.
I'm trying HTML5's speech input with x-webkit-speech in Google Chrome, and it works well, but you need to click an icon (and blind people can't use a mouse well). Is it possible to trigger x-webkit-speech by pressing a key? Do you know of any alternative JavaScript API?
Thank you!
When speech input is enabled the element will have a small microphone icon displayed on the right of the input. Clicking on this icon will launch a small tooltip to show that your voice is now being recorded. You can also start speech input by focussing the element and pressing Ctrl + Shift + .
The speech recognition part of the Web Speech API allows authorized Web applications to access the device's microphone and produces a transcript of the voice being recorded. This allows Web applications to use voice as one of their input and control methods, similar to touch or keyboard.
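As a rough sketch of how that could answer the keyboard question, the snippet below starts Web Speech API recognition from a key press instead of a mouse click. The helper name attachVoiceHotkey, its parameters, and the F2 hotkey are made up for illustration; only SpeechRecognition (webkitSpeechRecognition in Chrome) and its start(), lang, interimResults, onresult, onend and onerror members come from the API itself.

```javascript
// Sketch: start speech recognition from the keyboard instead of a mouse click.
// attachVoiceHotkey, its parameters and the F2 default are hypothetical choices.
function attachVoiceHotkey(target, RecognitionCtor, onTranscript, hotkey = 'F2') {
  const recognition = new RecognitionCtor();
  recognition.lang = 'en-US';
  recognition.interimResults = false;
  let listening = false;

  recognition.onresult = (event) => {
    // First alternative of the first result holds the most likely transcript.
    onTranscript(event.results[0][0].transcript);
  };
  // Allow a new session once the current one finishes or fails.
  recognition.onend = () => { listening = false; };
  recognition.onerror = () => { listening = false; };

  target.addEventListener('keydown', (event) => {
    if (event.key !== hotkey || listening) return;
    event.preventDefault();
    listening = true;
    recognition.start();
  });
  return recognition;
}

// Browser wiring: only runs where the API is actually available.
if (typeof window !== 'undefined') {
  const Ctor = window.SpeechRecognition || window.webkitSpeechRecognition;
  if (Ctor) {
    attachVoiceHotkey(document, Ctor, (text) => console.log('Heard:', text));
  }
}
```

Passing the constructor in as a parameter keeps the helper independent of the vendor prefix, and the listening flag avoids calling start() again while a session is still running.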
Is it possible to use x-webkit-speech pressing a key?
According to this post and this post, you cannot override the way speech input starts, which is by clicking the microphone.
What x-webkit-speech does is use the audio capture capabilities of HTML5 and send the audio to Google's servers for processing, which return the results as JSON. This blogger has reverse engineered it. You could develop a JavaScript library that listens for a key press to start capturing audio on HTML5-enabled browsers and sends it to Google's service or to one you have created. The downside of using Google's service is that it is an unsupported API and subject to change at any time. The downside of developing your own service is that it can be expensive to develop and maintain.
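A minimal sketch of that "key press, capture, send" idea might look like the following, assuming a current browser with getUserMedia and MediaRecorder. The endpoint /recognize, the { transcript } reply shape, the F4 hotkey and the five-second clip length are all placeholders invented for this example; they do not describe Google's service or any real one.

```javascript
// Sketch only: SERVICE_URL and the JSON reply shape { transcript } are hypothetical.
const SERVICE_URL = '/recognize';

// Record from the microphone for `ms` milliseconds and resolve with an audio Blob.
async function recordClip(ms) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks = [];
  recorder.ondataavailable = (event) => chunks.push(event.data);
  const stopped = new Promise((resolve) => { recorder.onstop = resolve; });
  recorder.start();
  setTimeout(() => recorder.stop(), ms);
  await stopped;
  stream.getTracks().forEach((track) => track.stop()); // release the microphone
  return new Blob(chunks, { type: recorder.mimeType });
}

// POST the recorded audio to the (hypothetical) service and return its transcript.
async function sendClip(blob, fetchImpl = fetch) {
  const response = await fetchImpl(SERVICE_URL, { method: 'POST', body: blob });
  if (!response.ok) {
    throw new Error('Recognition service returned ' + response.status);
  }
  const data = await response.json();
  return data.transcript;
}

// Browser wiring: F4 (an arbitrary choice) records five seconds and sends them.
if (typeof document !== 'undefined') {
  document.addEventListener('keydown', async (event) => {
    if (event.key !== 'F4') return;
    event.preventDefault();
    const text = await sendClip(await recordClip(5000));
    console.log('Transcript:', text);
  });
}
```

A fixed clip length keeps the sketch short; a real page would more likely stop recording on a second key press or on silence.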
Do you know any alternative API (JavaScript)?
This post and this post list some services available for speech recognition. I did not see Nuance listed; you may be able to use the Dragon Mobile SDK for this. You may also want to look into iSpeech.
Google Translate is a very good text-to-speech engine. I have used it to read text aloud. For example, if you have the text: Welcome to Stack Overflow
you can call it like this:
http://translate.google.com/translate_tts?ie=UTF-8&q=Welcome%20to%20stack%20overflow&tl=en&total=1&idx=0&textlen=23&prev=input
then use the browser's audio support to play it.
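The steps above could be sketched roughly as follows. The helper names buildTtsUrl and speak are made up for this example, and the query parameters are simply copied from the URL above; this is not a documented API, so it may change or stop working. Setting textlen to the length of the text is an assumption about what that parameter means.

```javascript
// Hypothetical helper: builds a URL in the same shape as the example URL above.
function buildTtsUrl(text, lang = 'en') {
  const params = [
    'ie=UTF-8',
    'q=' + encodeURIComponent(text),   // spaces become %20
    'tl=' + encodeURIComponent(lang),  // target language code
    'total=1',
    'idx=0',
    'textlen=' + text.length,          // assumption: length of the text
    'prev=input',
  ];
  return 'http://translate.google.com/translate_tts?' + params.join('&');
}

// Hypothetical helper: plays the generated speech with the browser's Audio element.
function speak(text, lang) {
  const audio = new Audio(buildTtsUrl(text, lang));
  return audio.play();
}
```

In a page, calling speak('Welcome to stack overflow') from a key or click handler would request the audio and start playback, provided the browser allows it.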
For speech input you can manually activate the listening process; see http://code.google.com/chrome/extensions/experimental.speechInput.html