I am currently writing an AI program that receives input from Dragon NaturallySpeaking (using Natlink), processes it, and returns a spoken output. I was able to come up with a Receiver GrammarBase that captures all input from Dragon and sends it to my parser.
class Receiver(GrammarBase):
    gramSpec = """ <start> exported = {emptyList}; """

    def initialize(self):
        self.load(self.gramSpec, allResults = 1)
        self.activateAll()

    def gotResultsObject(self, recogType, resObj):
        if recogType == 'reject':
            inpt, self.best_guess = [], []
        else:
            inpt = extract_words(resObj)
            inpt = process_input(inpt) # Forms a list of possible interpretations
            self.best_guess = resObj.getWords(0)
        self.send_input(inpt)

    def send_input(self, inpt):
        send = send_to_parser(inpt) # Sends first possible interpretation to parser
        try:
            while True:
                send.next() # Sends the next possible interpretation if the first is rejected
        except StopIteration: # If all interpretations are rejected, try sending the input to Dragon
            try:
                recognitionMimic(parse(self.best_guess))
            except MimicFailed: # If that fails too, execute all_failed
                all_failed()
This code works as expected, but there are several problems:
Dragon processes the input before sending it to my program. For example, if I were to say "Open Google Chrome.", it would open Google Chrome, and then send the input to Python. Is there a way to send the input to Python without first processing it?
When I call waitForSpeech(), a message box pops up, stating that the Python interpreter is waiting for input. Is it possible (for aesthetics and convenience) to prevent the message box from showing up, and instead terminate the speech collecting process after a significant pause from the user?
Thank you!
Regarding your first question: DNS uses the "Open ..." utterance internally as part of its own command resolution. As a result, DNS recognizes the speech and executes the command well before Natlink gets a chance to handle it. The only way around this is to change the utterance in your Natlink grammar from "Open ..." to something DNS does not already use, such as "Trigger ...".
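As a rough illustration of the parser side of that change, here is a minimal sketch in plain Python. It assumes the recognized utterance reaches your code as a list of words (as with resObj.getWords(0) in the question) and that "Trigger" is the replacement keyword. The names TRIGGER_WORD and strip_trigger are hypothetical and not part of Natlink.

```python
# Hypothetical helper, not part of Natlink: it only post-processes a word list.
TRIGGER_WORD = "trigger"  # assumed replacement for the built-in "Open" keyword


def strip_trigger(words):
    """Return the words after the leading trigger keyword.

    Returns None when the utterance does not start with the keyword,
    so the caller can treat it as ordinary input instead of a command.
    """
    if words and words[0].lower() == TRIGGER_WORD:
        return words[1:]
    return None


if __name__ == "__main__":
    # "Trigger Google Chrome" is treated as a command for your own program.
    print(strip_trigger(["Trigger", "Google", "Chrome"]))
    # "Open Google Chrome" is left alone (DNS would already have handled it).
    print(strip_trigger(["Open", "Google", "Chrome"]))
```

Your own program can then decide what to do with the remaining words (for example, launching the application itself), since DNS has no built-in command bound to the new keyword.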
Some of the natlink developers hang out at speechcomputing.com. You may get better responses there.
Good luck!