When using text-to-speech, I want background audio to dim (or 'duck'), then speak an utterance, and then un-duck the background audio. It mostly works; however, when I try to un-duck, the audio stays ducked and no error is thrown during deactivation.
The method that speaks an utterance:
// Create speech utterance
AVSpeechUtterance *speechUtterance = [[AVSpeechUtterance alloc]initWithString:textToSpeak];
speechUtterance.rate = instance.speechRate;
speechUtterance.pitchMultiplier = instance.speechPitch;
speechUtterance.volume = instance.speechVolume;
speechUtterance.postUtteranceDelay = 0.005;
AVSpeechSynthesisVoice *voice = [AVSpeechSynthesisVoice voiceWithLanguage:instance.voiceLanguageCode];
speechUtterance.voice = voice;
if (instance.speechSynthesizer.isSpeaking) {
    [instance.speechSynthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];
}
AVAudioSession *audioSession = [AVAudioSession sharedInstance];
NSError *activationError = nil;
[audioSession setActive:YES error:&activationError];
if (activationError) {
    NSLog(@"Error activating: %@", activationError);
}
[instance.speechSynthesizer speakUtterance:speechUtterance];
Then I deactivate the audio session when the speechUtterance has finished speaking:
- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer didFinishSpeechUtterance:(AVSpeechUtterance *)utterance
{
    dispatch_queue_t myQueue = dispatch_queue_create("com.company.appname", nil);
    dispatch_async(myQueue, ^{
        NSError *error = nil;
        if (![[AVAudioSession sharedInstance] setActive:NO withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&error]) {
            NSLog(@"Error deactivating: %@", error);
        }
    });
}
Setting the app's audio category in the App Delegate:
- (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions
{
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    NSError *setCategoryError = nil;
    [audioSession setCategory:AVAudioSessionCategoryPlayback
                  withOptions:AVAudioSessionCategoryOptionDuckOthers
                        error:&setCategoryError];
    return YES;
}
The ducking/unducking works when I deactivate the AVAudioSession after a delay:
dispatch_time_t popTime = dispatch_time(DISPATCH_TIME_NOW, 0.2 * NSEC_PER_SEC);
dispatch_after(popTime, dispatch_queue_create("com.company.appname", nil), ^(void){
    NSError *error = nil;
    if (![[AVAudioSession sharedInstance] setActive:NO withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&error]) {
        NSLog(@"Error deactivating: %@", error);
    }
});
However, the delay is noticeable and I get an error in the console:
[avas] AVAudioSession.mm:1074:-[AVAudioSession setActive:withOptions:error:]: Deactivating an audio session that has running I/O. All I/O should be stopped or paused prior to deactivating the audio session.
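For reference, the warning seems to want any speech I/O stopped before the session is deactivated. A minimal sketch of that idea (not the code I actually use, since I don't want to cut speech short) would be:
// Sketch only: ensure the synthesizer has no running output before deactivating.
if (instance.speechSynthesizer.isSpeaking) {
    [instance.speechSynthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];
}
NSError *deactivationError = nil;
if (![[AVAudioSession sharedInstance] setActive:NO
                                    withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
                                          error:&deactivationError]) {
    NSLog(@"Error deactivating: %@", deactivationError);
}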
How can I combine AVSpeechSynthesizer with ducking of background audio properly?
EDIT: Apparently the issue stems from using postUtteranceDelay on AVSpeechUtterance, which causes the music to stay dimmed. Removing that property fixes the issue. However, I need postUtteranceDelay for some of my utterances, so I have updated the title.
The ducking worked (started and stopped) without any issue or error using your code while listening to Spotify. I used an iPhone 6S on iOS 9.1, so it is possible that this is an iOS 10 issue.
I would recommend removing the dispatch wrap entirely, as it shouldn't be necessary. This may resolve the issue for you.
A working code sample is below. All I did was create a new project ("Single View Application") and change my AppDelegate.m to look like this:
#import "AppDelegate.h"
@import AVFoundation;
@interface AppDelegate () <AVSpeechSynthesizerDelegate>
@property (nonatomic, strong) AVSpeechSynthesizer *speechSynthesizer;
@end
@implementation AppDelegate
- (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions {
AVAudioSession *audioSession = [AVAudioSession sharedInstance];
NSError *setCategoryError = nil;
[audioSession setCategory:AVAudioSessionCategoryPlayback withOptions:AVAudioSessionCategoryOptionDuckOthers error:&setCategoryError];
if (setCategoryError) {
NSLog(@"error setting up: %@", setCategoryError);
}
self.speechSynthesizer = [[AVSpeechSynthesizer alloc] init];
self.speechSynthesizer.delegate = self;
AVSpeechUtterance *speechUtterance = [[AVSpeechUtterance alloc] initWithString:@"Hi there, how are you doing today?"];
AVSpeechSynthesisVoice *voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"en-US"];
speechUtterance.voice = voice;
NSError *activationError = nil;
[audioSession setActive:YES error:&activationError];
if (activationError) {
NSLog(@"Error activating: %@", activationError);
}
[self.speechSynthesizer speakUtterance:speechUtterance];
return YES;
}
- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer didFinishSpeechUtterance:(AVSpeechUtterance *)utterance {
NSError *error = nil;
if (![[AVAudioSession sharedInstance] setActive:NO withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&error]) {
NSLog(@"Error deactivating: %@", error);
}
}
@end
The only output in the console when running on a physical device is:
2016-12-21 09:42:08.484 DimOtherAudio[19017:3751445] Building MacinTalk voice for asset: (null)
UPDATE
Setting the postUtteranceDelay property created the same issue for me.
The documentation for postUtteranceDelay states:
The amount of time a speech synthesizer will wait after the utterance is spoken before handling the next queued utterance.
When two or more utterances are spoken by an instance of AVSpeechSynthesizer, the time between periods when either is audible will be at least the sum of the first utterance’s postUtteranceDelay and the second utterance’s preUtteranceDelay.
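As a purely hypothetical illustration of that timing rule (not code from my test project), the audible gap between two queued utterances is at least the sum of the two delays:
AVSpeechUtterance *first = [[AVSpeechUtterance alloc] initWithString:@"First sentence."];
first.postUtteranceDelay = 1.0;    // wait at least 1.0 s after this utterance
AVSpeechUtterance *second = [[AVSpeechUtterance alloc] initWithString:@"Second sentence."];
second.preUtteranceDelay = 0.5;    // wait at least 0.5 s before this one
// The silence between the two will be at least 1.0 + 0.5 = 1.5 seconds.
[self.speechSynthesizer speakUtterance:first];
[self.speechSynthesizer speakUtterance:second];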
It is pretty clear from the documentation that this value is only designed to be used when another utterance will be added to the queue. I confirmed that adding a second utterance which doesn't set postUtteranceDelay un-ducks the audio:
AVAudioSession *audioSession = [AVAudioSession sharedInstance];
NSError *setCategoryError = nil;
[audioSession setCategory:AVAudioSessionCategoryPlayback withOptions:AVAudioSessionCategoryOptionDuckOthers error:&setCategoryError];
if (setCategoryError) {
    NSLog(@"error setting up: %@", setCategoryError);
}

self.speechSynthesizer = [[AVSpeechSynthesizer alloc] init];
self.speechSynthesizer.delegate = self;

AVSpeechUtterance *speechUtterance = [[AVSpeechUtterance alloc] initWithString:@"Hi there, how are you doing today?"];
speechUtterance.postUtteranceDelay = 0.005;
AVSpeechSynthesisVoice *voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"en-US"];
speechUtterance.voice = voice;

NSError *activationError = nil;
[audioSession setActive:YES error:&activationError];
if (activationError) {
    NSLog(@"Error activating: %@", activationError);
}

[self.speechSynthesizer speakUtterance:speechUtterance];

// second utterance without postUtteranceDelay
AVSpeechUtterance *speechUtterance2 = [[AVSpeechUtterance alloc] initWithString:@"Duck. Duck. Goose."];
[self.speechSynthesizer speakUtterance:speechUtterance2];
Here's my Swift 3 version, taken from Casey's answer above:
import Foundation
import AVFoundation

class Utils: NSObject {

    static let shared = Utils()
    let synth = AVSpeechSynthesizer()
    let audioSession = AVAudioSession.sharedInstance()

    override init() {
        super.init()
        synth.delegate = self
    }

    func say(sentence: String) {
        do {
            try audioSession.setCategory(AVAudioSessionCategoryPlayback, with: AVAudioSessionCategoryOptions.duckOthers)
            let utterance = AVSpeechUtterance(string: sentence)
            try audioSession.setActive(true)
            synth.speak(utterance)
        } catch {
            print("Uh oh!")
        }
    }
}

extension Utils: AVSpeechSynthesizerDelegate {
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        do {
            try audioSession.setActive(false)
        } catch {
            print("Uh oh!")
        }
    }
}
I then call this anywhere in my app like: Utils.shared.say(sentence: "Thanks Casey!")