 

Intercepting input from OS X speech recognition utility

This question follows from OS X Yosemite (10.10) API for continuous speech recognition

OS X now has superb continuous speech recognition, but it doesn't appear to expose any API. I'm building a custom HCI kit, and I need to catch this speech input in order to process it.

How to intercept it?

My first thought was that it might create a virtual keyboard device through which it sends key-down/key-up events. If that were the case I could intercept it using IOKit, but when I enumerate my keyboard devices it doesn't appear (see the sketch below). So it must be operating at some higher level.
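For reference, this is roughly the enumeration I mean: a minimal sketch using IOHIDManager to list devices on the standard HID keyboard usage page. Nothing here is specific to dictation; it just shows that no extra virtual keyboard turns up.

```swift
import IOKit.hid

// Minimal sketch: list HID devices on the keyboard usage page,
// to check whether dictation registers a virtual keyboard device.
let manager = IOHIDManagerCreate(kCFAllocatorDefault, IOOptionBits(kIOHIDOptionsTypeNone))
let matching: [String: Any] = [
    kIOHIDDeviceUsagePageKey as String: kHIDPage_GenericDesktop,
    kIOHIDDeviceUsageKey as String: kHIDUsage_GD_Keyboard
]
IOHIDManagerSetDeviceMatching(manager, matching as CFDictionary)
IOHIDManagerOpen(manager, IOOptionBits(kIOHIDOptionsTypeNone))

if let devices = IOHIDManagerCopyDevices(manager) as? Set<IOHIDDevice> {
    for device in devices {
        let name = IOHIDDeviceGetProperty(device, kIOHIDProductKey as CFString) as? String
        print(name ?? "unnamed device")
    }
}
```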

Please note I'm adding the 'hacking' tag, as it appears that there is no ready-made path -- it is clearly something Apple did not intend to provide.

EDIT: related questions:
How to use DictationServices.framework
Can I use OS X 10.8's speech recognition/dictation without a GUI?

asked May 25 '15 by P i

1 Answer

Sadly, NSSpeechRecognizer only listens for a fixed array of command strings (I mention that because you brought it up in your linked question). I've looked at a few different ways to capture the input, but they're all pretty hacky.
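To make that concrete, here is a minimal sketch of what NSSpeechRecognizer actually offers: you hand it a list of command strings and get a delegate callback when one of them is spoken. The command vocabulary here is just an example; there's no free-form dictation.

```swift
import Cocoa

// NSSpeechRecognizer gives you a fixed command vocabulary, not dictation.
class CommandListener: NSObject, NSSpeechRecognizerDelegate {
    let recognizer = NSSpeechRecognizer()

    override init() {
        super.init()
        recognizer?.commands = ["open", "close", "next"]  // example vocabulary
        recognizer?.listensInForegroundOnly = false
        recognizer?.delegate = self
        recognizer?.startListening()
    }

    // Called only when one of the predefined commands is recognized.
    func speechRecognizer(_ sender: NSSpeechRecognizer, didRecognizeCommand command: String) {
        print("Heard command: \(command)")
    }
}
```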

The most popular way to "intercept" the speech is to trigger the dictation command (pressing fn twice, unless the user has changed the shortcut) and let the dictated text land in a text field you control. Not exactly elegant, especially for an HCI kit.
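A rough sketch of that approach, assuming the user starts dictation manually while your window has keyboard focus (the class and the window wiring are hypothetical, just to show the idea):

```swift
import Cocoa

// Hypothetical sketch: park keyboard focus on a text view and watch it change.
// Dictated words arrive as ordinary text insertions once dictation is running.
class DictationCatcher: NSObject, NSTextViewDelegate {
    let textView = NSTextView(frame: NSRect(x: 0, y: 0, width: 1, height: 1))

    func attach(to window: NSWindow) {
        textView.delegate = self
        window.contentView?.addSubview(textView)
        window.makeFirstResponder(textView)  // dictation inserts its text here
    }

    // NSTextDelegate callback fired whenever the dictated text changes.
    func textDidChange(_ notification: Notification) {
        print("Captured so far: \(textView.string)")
    }
}
```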

If you're feeling frisky you could take a look at the private DictationServices framework, but all of the standard warnings apply: App Store rejection, "Here be dragons," etc.

answered Sep 24 '22 by Sabrina