This question follows from OS X Yosemite (10.10) API for continuous speech recognition
OS X now has superb continuous speech recognition, but it doesn't appear to expose any API for it. I'm building a custom HCI kit, and I need to catch this speech input in order to process it.
How can I intercept it?
My first thought was that it might create a virtual keyboard device through which it sends key-down/key-up events. If that were the case, I could intercept them using IOKit -- but when I enumerate my keyboard devices, no such device appears. So it must be something higher-level.
Please note I'm adding the 'hacking' tag, as it appears that there is no ready-made path -- it is clearly something Apple did not intend to provide.
EDIT:
How to use DictationServices.framework
Can I use OS X 10.8's speech recognition/dictation without a GUI?
Sadly, NSSpeechRecognizer only listens for a fixed array of commands (I mention that because you brought it up in your linked question). I've looked at a few different ways to capture the input, but they're all pretty hacky.
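To illustrate the limitation: NSSpeechRecognizer matches spoken input against a command list you supply up front, so it can't give you free-form dictation. A minimal sketch (the command strings here are hypothetical placeholders):

```swift
import AppKit

// NSSpeechRecognizer only recognizes words from its `commands` array --
// it cannot transcribe arbitrary speech.
final class CommandListener: NSObject, NSSpeechRecognizerDelegate {
    // The initializer is failable, e.g. if speech recognition is unavailable.
    let recognizer = NSSpeechRecognizer()

    override init() {
        super.init()
        recognizer?.commands = ["open", "close", "save"]  // hypothetical commands
        recognizer?.delegate = self
        recognizer?.startListening()
    }

    // Fired only when one of the listed commands is heard.
    func speechRecognizer(_ sender: NSSpeechRecognizer,
                          didRecognizeCommand command: String) {
        print("Heard command: \(command)")
    }
}
```

Anything the user says that isn't in `commands` is simply ignored, which is why this API is a non-starter for capturing dictation.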
The most popular way to "intercept" the speech is to trigger the dictation command (pressing fn twice, unless the user has changed the shortcut) and let the dictated text land in a text field you control. Not exactly elegant, especially for an HCI kit.
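The harvesting side of that workaround can be sketched like this, assuming you put up an ordinary NSTextField, make it first responder, and let the user (or synthetic key events) trigger dictation with the configured shortcut; the class name here is my own:

```swift
import AppKit

// Sketch of the text-field workaround: system Dictation inserts the
// recognized speech into a focused NSTextField like any other typed
// text, so a delegate can harvest it on each change.
final class DictationSink: NSObject, NSTextFieldDelegate {
    let field = NSTextField(frame: NSRect(x: 0, y: 0, width: 400, height: 24))

    override init() {
        super.init()
        field.delegate = self
        // In a real app: add `field` to a window and make it first responder
        // before the dictation shortcut is pressed.
    }

    // Called on every edit, including text inserted by Dictation.
    func controlTextDidChange(_ notification: Notification) {
        print("Dictated so far: \(field.stringValue)")
    }
}
```

The obvious drawback is that a visible (or at least key) text field has to exist and hold focus while the user speaks, which is exactly why this approach feels inelegant.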
If you're feeling frisky, you could take a look at the private framework DictationServices, but all of the standard warnings apply: App Store rejection, "Here be dragons," etc.