
Availability of installed voices for use by AVSpeechSynthesis in iOS

I would like to be able to test which text-to-speech voices are available for my iOS app to use with AVSpeechSynthesis. It is easy to generate a list of the installed voices, but Apple makes some of them off-limits for use by apps, and I would like to know which.

For example, consider the following test code (Swift 5.1):

import AVFoundation

...

func voiceTest() {
    let speechSynthesizer = AVSpeechSynthesizer()
    let voices = AVSpeechSynthesisVoice.speechVoices()
    for voice in voices where voice.language == "en-US" {
        print("\(voice.language) - \(voice.name) - \(voice.quality.rawValue) [\(voice.identifier)]")
        let phrase = "The voice you're now listening to is the one called \(voice.name)."
        let utterance = AVSpeechUtterance(string: phrase)
        utterance.voice = voice
        speechSynthesizer.speak(utterance)
    }
}

When I call voiceTest(), the console output is this:

en-US - Nicky (Enhanced) - 2 [com.apple.ttsbundle.siri_female_en-US_premium]
en-US - Aaron - 1 [com.apple.ttsbundle.siri_male_en-US_compact]
en-US - Fred - 1 [com.apple.speech.synthesis.voice.Fred]
en-US - Nicky - 1 [com.apple.ttsbundle.siri_female_en-US_compact]
en-US - Samantha - 1 [com.apple.ttsbundle.Samantha-compact]
en-US - Alex - 2 [com.apple.speech.voice.Alex]

Some of the voices speak in their actual voice, whereas others speak in the default voice instead. In my case, both Nicky (com.apple.ttsbundle.siri_female_en-US_premium) and Alex (com.apple.speech.voice.Alex) are listed as high quality (quality raw value 2) but, when selected, sound like the low-quality default, Samantha.

I know that Apple has said that the Siri voices are not available for use in third-party apps. When I manually load Samantha (High Quality) on my iPhone via Settings, it appears in the list and I can use it. Perhaps Alex is just the high-quality male Siri voice, even though Aaron would seem to be the low-quality Siri voice based on its identifier (com.apple.ttsbundle.siri_male_en-US_compact)? Is that why Alex and Nicky (Enhanced) are the only two that are unavailable? And if I have my app specifically exclude those two, will it generate the true list of available voices? It would be nice to have some clarity.
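To make the "specifically exclude those" idea concrete, here is a minimal sketch of such a filter. It is only the guess described above, not a rule confirmed by Apple: the deny list holds the two identifiers that fell back to the default voice in my test, and the names `suspectedUnavailable` and `isProbablyUsable` are hypothetical.

```swift
import Foundation

// Identifiers that fell back to the default voice in the test above.
// Assumption: this list is a guess based on one device, not an Apple rule.
let suspectedUnavailable: Set<String> = [
    "com.apple.ttsbundle.siri_female_en-US_premium",
    "com.apple.speech.voice.Alex",
]

// Returns true when the identifier is not on the deny list.
func isProbablyUsable(voiceIdentifier: String) -> Bool {
    return !suspectedUnavailable.contains(voiceIdentifier)
}

// The six identifiers from the console output above.
let installed = [
    "com.apple.ttsbundle.siri_female_en-US_premium",
    "com.apple.ttsbundle.siri_male_en-US_compact",
    "com.apple.speech.synthesis.voice.Fred",
    "com.apple.ttsbundle.siri_female_en-US_compact",
    "com.apple.ttsbundle.Samantha-compact",
    "com.apple.speech.voice.Alex",
]

let usable = installed.filter { isProbablyUsable(voiceIdentifier: $0) }
usable.forEach { print($0) }
```

In the app, the same predicate could be applied to `voice.identifier` inside the `for voice in voices` loop. What I cannot tell is whether a hard-coded list like this holds on other devices or iOS versions.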

Anton asked Feb 07 '20 15:02


1 Answer

I've been looking for a way to programmatically use Siri's nice-sounding voices, such as English Siri Male (United States), and quickly discovered that this is not possible using the public speech API, even though the voice can be selected in System Preferences.

To answer your question, there are at least two other ways of finding available voices in addition to your code example.

Using defaults command

 defaults read com.apple.speech.voice.prefs > speech_prefs.txt

To find information on the voice currently selected in System Preferences, look for SelectedVoiceName in speech_prefs.txt.

For example, for English Siri Male (United States), this will be SelectedVoiceName = "Aaron Siri";.
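If you would rather not search the file by hand, the lookup can be sketched in Swift. This is only an illustration: it assumes the dump contains a line shaped like the `SelectedVoiceName = "Aaron Siri";` example above, the braces around the sample are illustrative, and the function name `selectedVoiceName(inPrefsDump:)` is hypothetical.

```swift
import Foundation

// Scans the text written by `defaults read ...` for a line of the form
//   SelectedVoiceName = "Aaron Siri";
// and returns the value without quotes or the trailing semicolon.
func selectedVoiceName(inPrefsDump dump: String) -> String? {
    for rawLine in dump.split(separator: "\n") {
        let line = rawLine.trimmingCharacters(in: .whitespaces)
        guard line.hasPrefix("SelectedVoiceName"),
              let equals = line.firstIndex(of: "=") else { continue }
        // Everything after "=", minus spaces, quotes and the semicolon.
        let value = line[line.index(after: equals)...]
            .trimmingCharacters(in: CharacterSet(charactersIn: " \";"))
        return value
    }
    return nil
}

// Illustrative sample; only the SelectedVoiceName line is taken from the real dump.
let sample = """
{
    SelectedVoiceName = "Aaron Siri";
}
"""
print(selectedVoiceName(inPrefsDump: sample) ?? "not found")
```

To run it against the real file, the text could be loaded with `try? String(contentsOfFile: "speech_prefs.txt", encoding: .utf8)`.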

Now, by further searching for aaron in speech_prefs.txt, you will find the following:

"VOICEID:com.apple.speech.synthesis.voice.custom.siri.aaron.premium_1" = {
    BundleIdentifier = "com.apple.speech.synthesis.voice.custom.siri.aaron.premium";

I tried both of these strings when initializing a voice, but got an error saying the voice was not found.

Looking for voice directories

There seem to be three locations:

/System/Library/Speech/Voices
/Library/Speech/Voices
~/Library/Speech/Voices

The third one seems to be a location for custom voices.

Each voice has its own directory.
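A rough way to see what is installed is to list the entries of the three locations mentioned above. The sketch below uses only `FileManager`; the function name `voiceBundles(in:)` is hypothetical, and a location that is missing or unreadable simply yields an empty list.

```swift
import Foundation

// Returns the directory entries found in each voice location.
// Missing or unreadable locations map to an empty array.
func voiceBundles(in directories: [String]) -> [String: [String]] {
    var result: [String: [String]] = [:]
    let fileManager = FileManager.default
    for directory in directories {
        // Expand "~" so the per-user location resolves to a real path.
        let expanded = NSString(string: directory).expandingTildeInPath
        let entries = (try? fileManager.contentsOfDirectory(atPath: expanded)) ?? []
        result[expanded] = entries.sorted()
    }
    return result
}

let locations = [
    "/System/Library/Speech/Voices",
    "/Library/Speech/Voices",
    "~/Library/Speech/Voices",
]

for (path, entries) in voiceBundles(in: locations).sorted(by: { $0.key < $1.key }) {
    print("\(path): \(entries.count) entries")
    entries.forEach { print("  \($0)") }
}
```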

If you compare the Info.plist files of a programmatically available voice and a programmatically unavailable one, you will see that they have different structures. For example, the unavailable voice lacks some attributes that correspond to the speech API, such as VoiceSupportedCharacters. I believe this is because some voices are of an older generation and some are newer.
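The comparison can be sketched as a key diff between two property lists. Treat this as an illustration under assumptions: the helper names are hypothetical, the location of Info.plist inside a voice directory is not specified here and has to be filled in by you, and apart from VoiceSupportedCharacters the keys in the sample dictionaries are made up.

```swift
import Foundation

// Loads a property list file as a dictionary, or nil if it cannot be read.
func loadPlist(atPath path: String) -> [String: Any]? {
    guard let data = FileManager.default.contents(atPath: path) else { return nil }
    let plist = try? PropertyListSerialization.propertyList(from: data, options: [], format: nil)
    return plist as? [String: Any]
}

// Lists the top-level keys that `reference` has and `candidate` lacks.
func keysMissing(from candidate: [String: Any], presentIn reference: [String: Any]) -> [String] {
    return Set(reference.keys).subtracting(candidate.keys).sorted()
}

// Made-up stand-ins for two Info.plist files; only VoiceSupportedCharacters
// is a key named in this answer, "ExampleKey" is purely illustrative.
let availableVoice: [String: Any] = ["ExampleKey": "a", "VoiceSupportedCharacters": "abc"]
let unavailableVoice: [String: Any] = ["ExampleKey": "b"]

print(keysMissing(from: unavailableVoice, presentIn: availableVoice))
```

With real files, pass the results of two `loadPlist(atPath:)` calls instead of the sample dictionaries.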

P.S.

Not directly relevant to your question, but just FYI: I'm still looking for a way to use Siri's voice programmatically. One idea is to make a copy of the voice directory and experiment with its Info.plist. The other is to automate the macOS UI: simulate the key press bound to the "Speak selected text when the key is pressed" option in System Preferences / Accessibility / Speech to trigger text-to-speech conversion, and then record the audio.

I'd appreciate it if anyone can share other ideas.

Valera Grishin answered Oct 16 '22 19:10