I am doing live speech recognition with the new iOS 10 speech framework. I use AVCaptureSession to get the audio.
I have a "listening" beep sound to notify the user that they can begin talking. The best place to play that sound is the first call to captureOutput(:didOutputSampleBuffer..), but if I try to play a sound after starting the session, it just won't play. No error is thrown; it just silently fails to play...
What I tried:
AudioServicesPlaySystemSound...()
AVPlayer
It seems that, regardless of what I do, it is impossible to play any kind of audio after triggering the recognition (not sure if it's specifically the AVCaptureSession or the SFSpeechAudioBufferRecognitionRequest / SFSpeechRecognitionTask...)
Any ideas? Apple even recommends playing a "listening" sound effect (and does so itself with Siri), but I couldn't find any reference or example showing how to actually do it... (their "SpeakToMe" example doesn't play a sound)
Well, apparently there are a bunch of "rules" one must follow in order to successfully begin a speech recognition session and play a "listening" effect only after the recognition has really begun.
The session setup and triggering must happen on the main queue. So:
DispatchQueue.main.async {
    speechRequest = SFSpeechAudioBufferRecognitionRequest()
    task = recognizer.recognitionTask(with: speechRequest, delegate: self)
    capture = AVCaptureSession()
    //.....
    shouldHandleRecordingBegan = true
    capture?.startRunning()
}
The "listening" effect should be played via AVPlayer, not as a system sound.
The safest place to know we are definitely recording is the AVCaptureAudioDataOutputSampleBufferDelegate callback, when we get our first sample buffer:
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
    // only once per recognition session
    if shouldHandleRecordingBegan {
        shouldHandleRecordingBegan = false
        player = AVPlayer(url: Bundle.main.url(forResource: "listening", withExtension: "aiff")!)
        player.play()
        DispatchQueue.main.async {
            // call delegate/handler closure/post notification etc...
        }
    }
    // append buffer to speech recognition
    speechRequest?.appendAudioSampleBuffer(sampleBuffer)
}
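One thing to watch in the snippet above: shouldHandleRecordingBegan is set on the main queue but read in the sample buffer callback, which may run on a different queue depending on how the delegate was registered. A minimal sketch of a lock-protected one-shot flag follows, assuming you want the "only once" check to be safe across queues. OneShotFlag and its arm()/consume() methods are hypothetical names, not part of AVFoundation or the Speech framework.

```swift
import Foundation

/// Hypothetical helper (not an Apple API): a flag that is armed on one
/// queue and consumed exactly once on another.
final class OneShotFlag {
    private let lock = NSLock()
    private var armed = false

    /// Arm the flag, e.g. right before capture?.startRunning().
    func arm() {
        lock.lock()
        armed = true
        lock.unlock()
    }

    /// Returns true only for the first call after arm(), then resets.
    func consume() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        let wasArmed = armed
        armed = false
        return wasArmed
    }
}
```

With this, the setup block would call arm() where it currently sets shouldHandleRecordingBegan = true, and the callback would use `if recordingBegan.consume() { ... }` (recordingBegan being an assumed property name) in place of the read-then-reset pair.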
The end-of-recognition effect is a hell of a lot easier:
var ended = false
if task?.state == .running || task?.state == .starting {
    task?.finish() // or task?.cancel() to cancel and not get results.
    ended = true
}
if capture?.isRunning == true {
    capture?.stopRunning()
}
if ended {
    player = AVPlayer(url: Bundle.main.url(forResource: "done", withExtension: "aiff")!)
    player.play()
}
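Both snippets force-unwrap the sound URL, which crashes if the file is missing from the bundle. A small sketch of a safer lookup follows, assuming the effects are bundled as .aiff files as in the code above. makeEffectPlayer is a hypothetical helper name; Bundle.url(forResource:withExtension:) and AVPlayer(url:) are the same calls already used in this answer.

```swift
import AVFoundation

/// Hypothetical helper: builds a player for a bundled sound effect,
/// or returns nil when the resource is not found in the bundle.
func makeEffectPlayer(named name: String,
                      withExtension ext: String = "aiff",
                      in bundle: Bundle = .main) -> AVPlayer? {
    guard let url = bundle.url(forResource: name, withExtension: ext) else {
        return nil // missing file: skip the effect instead of crashing
    }
    return AVPlayer(url: url)
}
```

Usage would look like `player = makeEffectPlayer(named: "done")` followed by `player?.play()`, with player declared as an optional property so it stays alive while the sound plays.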