Dragon NaturallySpeaking Programmers

Is there any way to incorporate Dragon NaturallySpeaking into an event-driven program? My boss would really like it if I used DNS to record user voice input without writing it to the screen, saving it directly to XML instead. I've been doing research for several days now and I cannot see a way to do this without the (really expensive) SDK, and I don't even know whether it would work then.

With Microsoft Speech you can write a (Python) program in which the speech recognizer waits until it detects a speech event and then processes it. It also has the handy ability to suggest alternative phrases to the one it thinks is the best guess, and to record the .wav file for later use. Sample code:

import win32com.client

class RecoEventHandler(SpRecoContext):
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        res = win32com.client.Dispatch(Result)
        phrase = res.PhraseInfo.GetText()
        # from here I would save it as XML

        # write reco phrases (the NBEST best alternates for this result)
        altPhrases = res.Alternates(NBEST)
        for altPhrase in altPhrases:
            nodePhrase = self.doc.createElement(TAG_PHRASE)

# the handler class has to be defined before it is used
spEngine = MsSpeech()
spEngine.setEventHandler(RecoEventHandler(spEngine.context))
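
For the "save it as XML" step, here is a minimal sketch of how the best guess and its alternates might be written out with the standard library's xml.dom.minidom. The function name build_utterance, the element names "speech" and "transcript", the "speaker" and "rank" attributes, and the value of TAG_PHRASE are all assumptions made for illustration; they are not taken from the production code.

```python
from xml.dom import minidom

TAG_PHRASE = "phrase"  # hypothetical tag name, chosen for this sketch


def build_utterance(doc, speaker, best, alternates):
    """Return a <transcript> element holding the best guess plus alternates."""
    transcript = doc.createElement("transcript")
    transcript.setAttribute("speaker", speaker)
    # rank 0 is the recognizer's best guess; alternates follow in order
    for rank, text in enumerate([best] + list(alternates)):
        node = doc.createElement(TAG_PHRASE)
        node.setAttribute("rank", str(rank))
        node.appendChild(doc.createTextNode(text))
        transcript.appendChild(node)
    return transcript


doc = minidom.Document()
root = doc.createElement("speech")
doc.appendChild(root)
root.appendChild(build_utterance(doc, "user", "hello world", ["hollow world"]))
xml_text = doc.toxml()
```

Inside OnRecognition, the same function could be fed res.PhraseInfo.GetText() and the text of each alternate, so nothing ever has to be written to the screen.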

I cannot seem to make DNS do this. The closest I've managed to jury-rig is:

while keepGoing:
    # DNS types the dictated words into the console prompt
    yourWords = raw_input("Your input: ")
    transcript_el = createTranscript(doc, "user", yourWords)
    speech_el.appendChild(transcript_el)
    if yourWords == 'bye':
        break

It even has the horrible side effect of making the user say "new-line" after every sentence! Not the preferred solution at all! Is there any way to make DNS do what Microsoft Speech does?

FYI: I know the logical solution would be to simply switch to Microsoft Speech but let's assume, just for grins and giggles, that that is not an option.

UPDATE - Has anyone bought the SDK? Did you find it useful?

Danni asked Jun 01 '10 20:06


1 Answer

Solution: download Natlink - http://qh.antenna.nl/unimacro/installation/installation.html It's not quite as flexible as SAPI, but it covers the basics and I got almost everything I needed out of it. Also, heads up: both it and Python need to be installed for all users on your machine or it won't work properly, and it works with every version of Python except 2.4.

Documentation for all supported commands is found under C:\NatLink\NatLink\MiscScripts\natlink.txt after you install it. The command reference comes below the list of updates at the top of the file.

Example code:

import natlink

def main():
    # make sure DNS is running before you start
    if not natlink.isNatSpeakRunning():
        raiseError('must start up Dragon NaturallySpeaking first!')
        shutdownServer()
        return
    # connect to natlink and load the grammar it's supposed to recognize
    natlink.natConnect()
    loggerGrammar = LoggerGrammar()
    loggerGrammar.initialize()
    if natlink.getMicState() == 'off':
        natlink.setMicState('on')
    userName = 'Danni'
    natlink.openUser(userName)
    # natlink.waitForSpeech() is a continuous loop waiting for input.
    # Results are sent to the gotResultsObject method of the logger grammar.
    natlink.waitForSpeech()
    natlink.natDisconnect()

The code's severely abbreviated from my production version, but I hope you get the idea. The only problem now is that I still have to return to the mini-window that natlink.waitForSpeech() creates and click 'close' before I can exit the program safely. A way to signal the window to close from Python without using the timeout parameter would be fantastic.
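
Since the LoggerGrammar class is not shown, here is a rough sketch of what its gotResultsObject method might do to gather the best guess and the alternates. It assumes the results object has a getWords(choice) method that returns a list of words and raises an out-of-range error once the choices run out, which is my reading of natlink.txt; check that file before relying on it. The names collect_choices, stop_on and FakeResult are made up for this sketch, and FakeResult only stands in for the real results object so the code runs without Dragon.

```python
def collect_choices(res_obj, max_choices=10, stop_on=(Exception,)):
    """Collect up to max_choices recognition choices as plain strings.

    Assumes res_obj.getWords(n) returns the word list for the nth choice
    and raises one of the stop_on exceptions when n is past the last one.
    """
    choices = []
    for n in range(max_choices):
        try:
            words = res_obj.getWords(n)
        except stop_on:
            break
        choices.append(" ".join(words))
    return choices


class FakeResult(object):
    """Hypothetical stand-in for a natlink results object (not natlink API)."""

    def __init__(self, choices):
        self._choices = choices

    def getWords(self, n):
        if n >= len(self._choices):
            raise IndexError(n)
        return self._choices[n]


fake = FakeResult([["hello", "world"], ["hollow", "world"]])
choices = collect_choices(fake, stop_on=(IndexError,))
```

With the real object, stop_on would be whatever exception natlink raises for an out-of-range choice; the first entry in the returned list is the best guess and the rest are the alternates to write to XML.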

Danni answered Sep 22 '22 06:09