I have working an application where we need to open certain screen based on voice command like if user says "Open Setting" then it should open the setting screen, so far that I have used the SpeechKit
framework but I am not able to detect the end of speech silence. Like how Siri does it. I want to detect if the user has ended his sentence/phrase.
Please find the below code for same where I have integrate the SpeechKit
framework in two ways.
A) Via closure(recognitionTask(with request: SFSpeechRecognitionRequest, resultHandler: @escaping (SFSpeechRecognitionResult?, Error?) -> Swift.Void) -> SFSpeechRecognitionTask
)
let audioEngine = AVAudioEngine()
let speechRecognizer = SFSpeechRecognizer()
let request = SFSpeechAudioBufferRecognitionRequest()
var recognitionTask: SFSpeechRecognitionTask?
func startRecording() throws {
let node = audioEngine.inputNode
let recordingFormat = node.outputFormat(forBus: 0)
node.installTap(onBus: 0, bufferSize: 1024,
format: recordingFormat) { [unowned self]
(buffer, _) in
self.request.append(buffer)
}
audioEngine.prepare()
try audioEngine.start()
weak var weakSelf = self
recognitionTask = speechRecognizer?.recognitionTask(with: request) {
(result, error) in
if result != nil {
if let transcription = result?.bestTranscription {
weakSelf?.idenifyVoiceCommand(transcription)
}
}
}
}
But when I say any word/sentence like "Open Setting" then closure(recognitionTask(with:)
) called multiple times and I have put the method(idenifyVoiceCommand
) inside the closure which call multiple times, so how can I restrict to call only one time.
And I also review the Timer logic while googling it(SFSpeechRecognizer - detect end of utterance) but in my scenarion it does not work beacause I did not stop the audio engine as it continuously listening the user’s voice like Siri does.
B) Via delegate(SFSpeechRecognitionTaskDelegate
)
speechRecognizer.recognitionTask(with: self.request, delegate: self)
func speechRecognitionTaskWasCancelled(_ task: SFSpeechRecognitionTask) {
}
func speechRecognitionTask(_ task: SFSpeechRecognitionTask, didFinishSuccessfully successfully: Bool) {
}
And I found that the delegate which handle when the end of speech occurs do not call it and accidentally call it after sometimes.
I had the same issue until now.
I checked your question and I suppose the code below helps you achieve the same thing I did:
recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest,
resultHandler: { (result, error) in
var isFinal = false
if result != nil {
self.inputTextView.text = result?.bestTranscription.formattedString
isFinal = (result?.isFinal)!
}
if let timer = self.detectionTimer, timer.isValid {
if isFinal {
self.inputTextView.text = ""
self.textViewDidChange(self.inputTextView)
self.detectionTimer?.invalidate()
}
} else {
self.detectionTimer = Timer.scheduledTimer(withTimeInterval: 1.5, repeats: false, block: { (timer) in
self.handleSend()
isFinal = true
timer.invalidate()
})
}
})
This checks if input wasn't received for 1.5 seconds
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With