Continuously listen to the user's voice and detect end-of-speech silence in the SpeechKit framework

Question

I am working on an application where we need to open a certain screen based on a voice command. For example, if the user says "Open Setting", it should open the settings screen. So far I have used the SpeechKit framework, but I am not able to detect the silence at the end of speech the way Siri does. I want to detect when the user has finished a sentence or phrase.

Please find the code below, where I have integrated the SpeechKit framework in two ways.

A) Via closure (recognitionTask(with request: SFSpeechRecognitionRequest, resultHandler: @escaping (SFSpeechRecognitionResult?, Error?) -> Swift.Void) -> SFSpeechRecognitionTask)

import AVFoundation
import Speech

let audioEngine = AVAudioEngine()
let speechRecognizer = SFSpeechRecognizer()
let request = SFSpeechAudioBufferRecognitionRequest()
var recognitionTask: SFSpeechRecognitionTask?

func startRecording() throws {
    let node = audioEngine.inputNode
    let recordingFormat = node.outputFormat(forBus: 0)

    // Feed every captured microphone buffer into the recognition request.
    node.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { [unowned self] (buffer, _) in
        self.request.append(buffer)
    }

    audioEngine.prepare()
    try audioEngine.start()

    // This handler runs for every partial result, not once per phrase.
    recognitionTask = speechRecognizer?.recognitionTask(with: request) { [weak self] (result, error) in
        if let transcription = result?.bestTranscription {
            self?.idenifyVoiceCommand(transcription)
        }
    }
}

But when I say any word or sentence such as "Open Setting", the closure (recognitionTask(with:)) is called multiple times. I call my method (idenifyVoiceCommand) inside the closure, so it is also called multiple times. How can I restrict it so that it is called only once?
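To make the repeated calls concrete: the handler receives a growing transcription ("Open", then "Open Setting", possibly "Open Setting" again), so any action placed in it fires once per partial result. The following is only a minimal sketch of one way to act once per command, and it does not use the Speech framework at all. `VoiceCommand`, `CommandGate` and the phrase list are hypothetical names introduced for illustration.

```swift
import Foundation

/// Hypothetical commands; the phrases are assumptions for illustration.
enum VoiceCommand: String, CaseIterable {
    case openSetting = "open setting"
    case goBack = "go back"
}

/// Lets a command through only the first time it shows up at the end
/// of the growing transcription, until `reset()` is called.
struct CommandGate {
    private var lastHandled: VoiceCommand?

    mutating func command(in transcription: String) -> VoiceCommand? {
        let text = transcription
            .lowercased()
            .trimmingCharacters(in: .whitespacesAndNewlines)
        // Only look at the end of the text, because the transcription
        // keeps growing while the task is running.
        guard let match = VoiceCommand.allCases.first(where: { text.hasSuffix($0.rawValue) }),
              match != lastHandled else { return nil }
        lastHandled = match
        return match
    }

    /// Call when a new utterance starts, so the same command can fire again.
    mutating func reset() { lastHandled = nil }
}

// Simulated partial results for a single utterance.
var gate = CommandGate()
for partial in ["Open", "Open Setting", "Open Setting"] {
    if let command = gate.command(in: partial) {
        print("Handle \(command)")   // reached once for this utterance
    }
}
```

In the closure above, the string passed to such a gate would be transcription.formattedString. The gate deliberately ignores a command that is repeated back to back until reset() is called, so it needs some signal that the utterance has ended.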

I also reviewed the Timer logic I found while googling (SFSpeechRecognizer - detect end of utterance), but it does not work in my scenario because I do not stop the audio engine: it keeps listening to the user’s voice continuously, like Siri does.

B) Via delegate (SFSpeechRecognitionTaskDelegate)

// SFSpeechRecognizer() is failable, so the recognizer is optional here.
recognitionTask = speechRecognizer?.recognitionTask(with: self.request, delegate: self)

func speechRecognitionTaskWasCancelled(_ task: SFSpeechRecognitionTask) {

}

func speechRecognitionTask(_ task: SFSpeechRecognitionTask, didFinishSuccessfully successfully: Bool) {

}

I found that the delegate method that should handle the end of speech is not called when the speech ends; it is only called occasionally, some time later.

Muhammad Essa · Accepted Answer

I had the same issue until now.

I checked your question, and I suppose the code below will help you achieve the same thing I did:

recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest,
resultHandler: { [weak self] (result, error) in
    guard let self = self else { return }

    // Treat a missing result as "not final" instead of force-unwrapping.
    let isFinal = result?.isFinal ?? false

    if let result = result {
        self.inputTextView.text = result.bestTranscription.formattedString
    }

    // Every callback means new input arrived, so cancel the pending timer.
    self.detectionTimer?.invalidate()

    if isFinal {
        // The recognizer ended the utterance itself: clear the text view.
        self.inputTextView.text = ""
        self.textViewDidChange(self.inputTextView)
    } else {
        // No new result for 1.5 seconds is treated as the end of speech.
        self.detectionTimer = Timer.scheduledTimer(withTimeInterval: 1.5, repeats: false, block: { [weak self] _ in
            self?.handleSend()
        })
    }
})

This treats 1.5 seconds without a new recognition result as the end of speech.
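The same idea can be pulled out of the view controller into a small helper. This is only a sketch under assumptions: `SilenceDetector` and its method names are hypothetical, and it uses a `DispatchWorkItem` instead of `Timer`, so it is a swapped-in technique rather than the code above. Each call to `inputReceived()` cancels the previous countdown and starts a new one, so the callback runs only after a full interval with no input.

```swift
import Foundation

/// Hypothetical helper: runs `onSilence` once no `inputReceived()` call
/// has arrived for `interval` seconds.
final class SilenceDetector {
    private let interval: TimeInterval
    private let queue: DispatchQueue
    private let onSilence: () -> Void
    private var pending: DispatchWorkItem?

    init(interval: TimeInterval,
         queue: DispatchQueue = .main,
         onSilence: @escaping () -> Void) {
        self.interval = interval
        self.queue = queue
        self.onSilence = onSilence
    }

    /// Call for every partial result. Must be called on `queue`,
    /// so that `pending` is never touched from two threads at once.
    func inputReceived() {
        pending?.cancel()                 // new input: drop the old countdown
        let item = DispatchWorkItem { [weak self] in
            self?.pending = nil
            self?.onSilence()
        }
        pending = item
        queue.asyncAfter(deadline: .now() + interval, execute: item)
    }

    /// Stops a pending countdown, for example when a final result arrives.
    func cancel() {
        pending?.cancel()
        pending = nil
    }
}
```

Inside the result handler you would call inputReceived() for every non-final result and cancel() when result.isFinal is true. As far as I know, SFSpeechRecognizer delivers its callbacks on the main queue unless its queue property is changed, which is why the sketch defaults to the main queue.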

Tags: ios10, swift, speech-recognition, speech-to-text, speechkit

Ramkrishna Sharma
