 

Speech recognition, nodeJS

I'm currently working on a tool that reads me all my notifications by connecting to several APIs.

It's working great, but now I would like to add voice commands to trigger actions.

For example, when the software says "One mail from Bob", I would like to be able to say "Read it" or "Archive it".

My software runs on a Node server. I currently don't have any browser implementation, but that could be an option later.

What is the best way to do speech-to-text in Node.js?

I've seen a lot of threads about it, but they mostly rely on the browser, and I would like to avoid that at first. Is it possible?

Another issue is that some software requires a WAV file as input. I don't have any file; I just want my software to listen continuously and react when I say a command.

Do you have any information on how I could do that?

Cheers

asked Feb 26 '16 04:02 by Vico



1 Answer

To get audio data into your application, you could try a module like microphone, which I haven't used but which looks promising. This could be a way to avoid having to use the browser for audio input.
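If you would rather not depend on a particular module, one possible approach is to spawn a command-line recorder and read raw audio from its stdout. The sketch below assumes a Linux machine with ALSA's `arecord` on the PATH; the `startCapture` helper and `ARECORD_ARGS` constant are hypothetical names used only for illustration, and this is not the API of the microphone module.

```javascript
const { spawn } = require('child_process');

// Assumption: ALSA's `arecord` is installed (typical on Linux).
// Flags: quiet, signed 16-bit little-endian samples, 16 kHz, mono, raw PCM.
// With no output file given, arecord writes the audio to stdout.
const ARECORD_ARGS = ['-q', '-f', 'S16_LE', '-r', '16000', '-c', '1', '-t', 'raw'];

// Hypothetical helper: starts recording and passes each PCM chunk (a Buffer)
// to the onChunk callback. Nothing is written to disk.
function startCapture(onChunk) {
  const rec = spawn('arecord', ARECORD_ARGS, { stdio: ['ignore', 'pipe', 'inherit'] });
  rec.stdout.on('data', onChunk);
  rec.on('error', (err) => console.error('could not start arecord:', err.message));
  return rec; // call rec.kill() to stop listening
}
```

Calling `startCapture((chunk) => { /* handle audio */ })` would then deliver a continuous stream of buffers, which addresses the "I don't have any file" part of the question.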

To do actual speech recognition, you could use the Speech to Text service of IBM Watson Developer Cloud. This service supports a websocket interface, so that you can have a full duplex service, piping audio data to the cloud and getting back the resulting transcription. You may want to consider implementing a form of onset detection in order to avoid transmitting a lot of (relative) silence to the service - that way, you can stay within the free tier.
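A very simple stand-in for onset detection is an energy gate: compute the loudness of each audio chunk and only forward chunks that exceed a threshold, plus a few trailing chunks so word endings are not cut off. The sketch below assumes signed 16-bit little-endian PCM input; `rms`, `createSilenceGate`, the `threshold` of 0.02 and the `hangChunks` value are illustrative choices, not anything prescribed by the Watson service, and the `send` callback stands in for whatever writes to the websocket.

```javascript
// Root-mean-square level of a chunk of signed 16-bit little-endian PCM,
// scaled to the range 0..1. A trailing odd byte, if any, is ignored.
function rms(chunk) {
  const samples = Math.floor(chunk.length / 2);
  if (samples === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples; i++) {
    const s = chunk.readInt16LE(i * 2) / 32768;
    sum += s * s;
  }
  return Math.sqrt(sum / samples);
}

// Hypothetical gate: forwards a chunk when it is loud enough, and keeps
// forwarding for `hangChunks` further chunks after the last loud one.
function createSilenceGate(send, { threshold = 0.02, hangChunks = 5 } = {}) {
  let remaining = 0;
  return function onChunk(chunk) {
    if (rms(chunk) >= threshold) remaining = hangChunks + 1;
    if (remaining > 0) {
      remaining--;
      send(chunk);
    }
  };
}
```

The function returned by `createSilenceGate` can be used directly as the per-chunk callback of your audio source. The threshold will likely need tuning for your microphone and room noise.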

There is also a text-to-speech service, but it sounds like you have a solution already for that part of your tool.

Disclosure: I am an evangelist for IBM Watson.

answered Nov 15 '22 21:11 by Abtin Forouzandeh
