Web Speech API: Consistently get the supported speech synthesis voices on iOS Safari

I'm trying to get the list of supported speech synthesis voices on iOS Safari.

As per the API, I should be able to get an array of voices by calling:

window.speechSynthesis.getVoices();

Sometimes this gives me a list of voices, other times it doesn't. See the following jsfiddle: https://jsfiddle.net/sq7xf327/

If I open this on my iPhone 5 (iOS 8.1.3), the results are inconsistent. Sometimes I get back all 37 voices, and other times I get 0. If you keep refreshing, it sporadically displays either 37 or 0.

I know that in Chrome you can add an event listener to the

window.speechSynthesis.voiceschanged 

event to know when the voices have loaded, but this event is not supported in Safari.

A trick I've tried is to check periodically:

var timer = setInterval(function () {
    window.voices_ = window.speechSynthesis.getVoices();
    if (window.voices_.length > 0) {
        clearInterval(timer);
    }
}, 1000);

This has also not given me consistent results.

Any idea how I can reliably and consistently get the supported speech synthesis voices on iOS Safari?

Ken Adams asked Mar 09 '15 17:03


2 Answers

I also encountered this and reported it as a bug to Apple. As of today the bug report is still open.

What I ended up doing as a workaround was to hard-code an array of the 37 voices. If speechSynthesis.getVoices() returns an empty array, the code uses the hard-coded array instead.

var _voices = [];

// iOS 8
var _iOSvoices = [
    {name: "pt-BR", voiceURI: "pt-BR", lang: "pt-BR", localService: true, default: true},
    {name: "fr-CA", voiceURI: "fr-CA", lang: "fr-CA", localService: true, default: true},
    {name: "sk-SK", voiceURI: "sk-SK", lang: "sk-SK", localService: true, default: true},
    {name: "th-TH", voiceURI: "th-TH", lang: "th-TH", localService: true, default: true},
    {name: "ro-RO", voiceURI: "ro-RO", lang: "ro-RO", localService: true, default: true},
    {name: "no-NO", voiceURI: "no-NO", lang: "no-NO", localService: true, default: true},
    {name: "fi-FI", voiceURI: "fi-FI", lang: "fi-FI", localService: true, default: true},
    {name: "pl-PL", voiceURI: "pl-PL", lang: "pl-PL", localService: true, default: true},
    {name: "de-DE", voiceURI: "de-DE", lang: "de-DE", localService: true, default: true},
    {name: "nl-NL", voiceURI: "nl-NL", lang: "nl-NL", localService: true, default: true},
    {name: "id-ID", voiceURI: "id-ID", lang: "id-ID", localService: true, default: true},
    {name: "tr-TR", voiceURI: "tr-TR", lang: "tr-TR", localService: true, default: true},
    {name: "it-IT", voiceURI: "it-IT", lang: "it-IT", localService: true, default: true},
    {name: "pt-PT", voiceURI: "pt-PT", lang: "pt-PT", localService: true, default: true},
    {name: "fr-FR", voiceURI: "fr-FR", lang: "fr-FR", localService: true, default: true},
    {name: "ru-RU", voiceURI: "ru-RU", lang: "ru-RU", localService: true, default: true},
    {name: "es-MX", voiceURI: "es-MX", lang: "es-MX", localService: true, default: true},
    {name: "zh-HK", voiceURI: "zh-HK", lang: "zh-HK", localService: true, default: true},
    {name: "sv-SE", voiceURI: "sv-SE", lang: "sv-SE", localService: true, default: true},
    {name: "hu-HU", voiceURI: "hu-HU", lang: "hu-HU", localService: true, default: true},
    {name: "zh-TW", voiceURI: "zh-TW", lang: "zh-TW", localService: true, default: true},
    {name: "es-ES", voiceURI: "es-ES", lang: "es-ES", localService: true, default: true},
    {name: "zh-CN", voiceURI: "zh-CN", lang: "zh-CN", localService: true, default: true},
    {name: "nl-BE", voiceURI: "nl-BE", lang: "nl-BE", localService: true, default: true},
    {name: "en-GB", voiceURI: "en-GB", lang: "en-GB", localService: true, default: true},
    {name: "ar-SA", voiceURI: "ar-SA", lang: "ar-SA", localService: true, default: true},
    {name: "ko-KR", voiceURI: "ko-KR", lang: "ko-KR", localService: true, default: true},
    {name: "cs-CZ", voiceURI: "cs-CZ", lang: "cs-CZ", localService: true, default: true},
    {name: "en-ZA", voiceURI: "en-ZA", lang: "en-ZA", localService: true, default: true},
    {name: "en-AU", voiceURI: "en-AU", lang: "en-AU", localService: true, default: true},
    {name: "da-DK", voiceURI: "da-DK", lang: "da-DK", localService: true, default: true},
    {name: "en-US", voiceURI: "en-US", lang: "en-US", localService: true, default: true},
    {name: "en-IE", voiceURI: "en-IE", lang: "en-IE", localService: true, default: true},
    {name: "he-IL", voiceURI: "he-IL", lang: "he-IL", localService: true, default: true},
    {name: "hi-IN", voiceURI: "hi-IN", lang: "hi-IN", localService: true, default: true},
    {name: "el-GR", voiceURI: "el-GR", lang: "el-GR", localService: true, default: true},
    {name: "ja-JP", voiceURI: "ja-JP", lang: "ja-JP", localService: true, default: true}
];

function populateVoices() {
    // Give the browser a moment to load its voices before asking for them.
    // The timer fires only once, so there is nothing to clear afterwards.
    setTimeout(function() {
        _voices = speechSynthesis.getVoices();

        if (_voices.length === 0) {
            // use hard-coded list because speechSynthesis.getVoices() didn't work
            _voices = _iOSvoices;
        }
    }, 100);
}
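
A variation on the same idea, as a rough sketch rather than a tested fix: instead of a single 100 ms wait, poll getVoices() a limited number of times and fall back to the hard-coded list only once the attempts run out. The name loadVoices, its parameters, and the default attempt count are assumptions made for this illustration. The synth object is passed in as an argument so the function can be tried with a stand-in object outside the browser.

```javascript
// Hypothetical helper (not part of the original answer).
// Polls synth.getVoices() until it returns a non-empty list or the
// attempts run out, then resolves with the fallback list instead.
function loadVoices(synth, fallbackVoices, maxAttempts, intervalMs) {
    maxAttempts = maxAttempts || 10;   // assumed default
    intervalMs = intervalMs || 100;    // assumed default, same as the 100 ms delay above

    return new Promise(function (resolve) {
        var attempts = 0;

        function check() {
            var voices = synth.getVoices();
            attempts++;

            if (voices.length > 0) {
                resolve(voices);            // the browser delivered real voices
            } else if (attempts >= maxAttempts) {
                resolve(fallbackVoices);    // give up and use the hard-coded list
            } else {
                setTimeout(check, intervalMs);
            }
        }

        check();
    });
}
```

It could be called as loadVoices(window.speechSynthesis, _iOSvoices).then(function (voices) { _voices = voices; }); so the rest of the code only ever sees a non-empty list.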

UPDATE

iOS 9 is somewhat better at this if you remove the delay.

function _populateVoices() {
    _voices = speechSynthesis.getVoices();

    if (_voices.length === 0) {
        // use hard-coded list because speechSynthesis.getVoices() didn't work
        _voices = _iOS9voices;
    }
}

var _iOS9voices = [
  { name: "Maged", voiceURI: "com.apple.ttsbundle.Maged-compact", lang: "ar-SA", localService: true, "default": true },
  { name: "Zuzana", voiceURI: "com.apple.ttsbundle.Zuzana-compact", lang: "cs-CZ", localService: true, "default": true },
  { name: "Sara", voiceURI: "com.apple.ttsbundle.Sara-compact", lang: "da-DK", localService: true, "default": true },
  { name: "Anna", voiceURI: "com.apple.ttsbundle.Anna-compact", lang: "de-DE", localService: true, "default": true },
  { name: "Melina", voiceURI: "com.apple.ttsbundle.Melina-compact", lang: "el-GR", localService: true, "default": true },
  { name: "Karen", voiceURI: "com.apple.ttsbundle.Karen-compact", lang: "en-AU", localService: true, "default": true },
  { name: "Daniel", voiceURI: "com.apple.ttsbundle.Daniel-compact", lang: "en-GB", localService: true, "default": true },
  { name: "Moira", voiceURI: "com.apple.ttsbundle.Moira-compact", lang: "en-IE", localService: true, "default": true },
  { name: "Samantha (Enhanced)", voiceURI: "com.apple.ttsbundle.Samantha-premium", lang: "en-US", localService: true, "default": true },
  { name: "Samantha", voiceURI: "com.apple.ttsbundle.Samantha-compact", lang: "en-US", localService: true, "default": true },
  { name: "Tessa", voiceURI: "com.apple.ttsbundle.Tessa-compact", lang: "en-ZA", localService: true, "default": true },
  { name: "Monica", voiceURI: "com.apple.ttsbundle.Monica-compact", lang: "es-ES", localService: true, "default": true },
  { name: "Paulina", voiceURI: "com.apple.ttsbundle.Paulina-compact", lang: "es-MX", localService: true, "default": true },
  { name: "Satu", voiceURI: "com.apple.ttsbundle.Satu-compact", lang: "fi-FI", localService: true, "default": true },
  { name: "Amelie", voiceURI: "com.apple.ttsbundle.Amelie-compact", lang: "fr-CA", localService: true, "default": true },
  { name: "Thomas", voiceURI: "com.apple.ttsbundle.Thomas-compact", lang: "fr-FR", localService: true, "default": true },
  { name: "Carmit", voiceURI: "com.apple.ttsbundle.Carmit-compact", lang: "he-IL", localService: true, "default": true },
  { name: "Lekha", voiceURI: "com.apple.ttsbundle.Lekha-compact", lang: "hi-IN", localService: true, "default": true },
  { name: "Mariska", voiceURI: "com.apple.ttsbundle.Mariska-compact", lang: "hu-HU", localService: true, "default": true },
  { name: "Damayanti", voiceURI: "com.apple.ttsbundle.Damayanti-compact", lang: "id-ID", localService: true, "default": true },
  { name: "Alice", voiceURI: "com.apple.ttsbundle.Alice-compact", lang: "it-IT", localService: true, "default": true },
  { name: "Kyoko", voiceURI: "com.apple.ttsbundle.Kyoko-compact", lang: "ja-JP", localService: true, "default": true },
  { name: "Yuna", voiceURI: "com.apple.ttsbundle.Yuna-compact", lang: "ko-KR", localService: true, "default": true },
  { name: "Ellen", voiceURI: "com.apple.ttsbundle.Ellen-compact", lang: "nl-BE", localService: true, "default": true },
  { name: "Xander", voiceURI: "com.apple.ttsbundle.Xander-compact", lang: "nl-NL", localService: true, "default": true },
  { name: "Nora", voiceURI: "com.apple.ttsbundle.Nora-compact", lang: "no-NO", localService: true, "default": true },
  { name: "Zosia", voiceURI: "com.apple.ttsbundle.Zosia-compact", lang: "pl-PL", localService: true, "default": true },
  { name: "Luciana", voiceURI: "com.apple.ttsbundle.Luciana-compact", lang: "pt-BR", localService: true, "default": true },
  { name: "Joana", voiceURI: "com.apple.ttsbundle.Joana-compact", lang: "pt-PT", localService: true, "default": true },
  { name: "Ioana", voiceURI: "com.apple.ttsbundle.Ioana-compact", lang: "ro-RO", localService: true, "default": true },
  { name: "Milena", voiceURI: "com.apple.ttsbundle.Milena-compact", lang: "ru-RU", localService: true, "default": true },
  { name: "Laura", voiceURI: "com.apple.ttsbundle.Laura-compact", lang: "sk-SK", localService: true, "default": true },
  { name: "Alva", voiceURI: "com.apple.ttsbundle.Alva-compact", lang: "sv-SE", localService: true, "default": true },
  { name: "Kanya", voiceURI: "com.apple.ttsbundle.Kanya-compact", lang: "th-TH", localService: true, "default": true },
  { name: "Yelda", voiceURI: "com.apple.ttsbundle.Yelda-compact", lang: "tr-TR", localService: true, "default": true },
  { name: "Ting-Ting", voiceURI: "com.apple.ttsbundle.Ting-Ting-compact", lang: "zh-CN", localService: true, "default": true },
  { name: "Sin-Ji", voiceURI: "com.apple.ttsbundle.Sin-Ji-compact", lang: "zh-HK", localService: true, "default": true },
  { name: "Mei-Jia", voiceURI: "com.apple.ttsbundle.Mei-Jia-compact", lang: "zh-TW", localService: true, "default": true }
];
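
One caveat when using either hard-coded list: its entries are plain objects, while the voice property of a SpeechSynthesisUtterance is specified to hold a SpeechSynthesisVoice (or null), so assigning a hard-coded entry to it directly may throw or be ignored depending on the browser. The sketch below, under that assumption, always sets the language and assigns a voice only when a matching object actually came back from getVoices(). The name applyVoice and its parameters are hypothetical.

```javascript
// Hypothetical helper (not part of the original answer).
// entry: an item from _voices, either a real voice or a hard-coded object.
// realVoices: whatever speechSynthesis.getVoices() returned (possibly empty).
function applyVoice(utterance, entry, realVoices) {
    // Always set the language so the engine can choose a default voice for it.
    utterance.lang = entry.lang;

    // Only assign a voice object that came from the browser itself.
    for (var i = 0; i < realVoices.length; i++) {
        if (realVoices[i].voiceURI === entry.voiceURI) {
            utterance.voice = realVoices[i];
            break;
        }
    }
    return utterance;
}
```

With an empty realVoices array the utterance carries only a lang value, which is the case the hard-coded lists are meant to cover.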
Sarah Elan answered Nov 11 '22 15:11


I implemented this JS functionality on my site for desktop, Android and iOS.

As I understand it, on mobile the voice used for an utterance is controlled by the phone's general settings, so the user needs to go to the TTS settings, choose a preferred voice for a specific language, and possibly download a better-quality voice. For example, the iOS Alex voice is close to 800 MB and works offline. The list of supported languages can be found on the vendor's site for the specific Android or iOS version.

So here is what I ended up with.

First, we need a function to check whether the browser is a mobile one:

// Detecting a mobile browser

window.mobilecheck = function() {
    var check = false;
    (function(a){if(/(android|bb\d+|meego).+mobile|avantgo|bada\/|blackberry|blazer|compal|elaine|fennec|hiptop|iemobile|ip(hone|od)|iris|kindle|lge |maemo|midp|mmp|mobile.+firefox|netfront|opera m(ob|in)i|palm( os)?|phone|p(ixi|re)\/|plucker|pocket|psp|series(4|6)0|symbian|treo|up\.(browser|link)|vodafone|wap|windows ce|xda|xiino/i.test(a)||/1207|6310|6590|3gso|4thp|50[1-6]i|770s|802s|a wa|abac|ac(er|oo|s\-)|ai(ko|rn)|al(av|ca|co)|amoi|an(ex|ny|yw)|aptu|ar(ch|go)|as(te|us)|attw|au(di|\-m|r |s )|avan|be(ck|ll|nq)|bi(lb|rd)|bl(ac|az)|br(e|v)w|bumb|bw\-(n|u)|c55\/|capi|ccwa|cdm\-|cell|chtm|cldc|cmd\-|co(mp|nd)|craw|da(it|ll|ng)|dbte|dc\-s|devi|dica|dmob|do(c|p)o|ds(12|\-d)|el(49|ai)|em(l2|ul)|er(ic|k0)|esl8|ez([4-7]0|os|wa|ze)|fetc|fly(\-|_)|g1 u|g560|gene|gf\-5|g\-mo|go(\.w|od)|gr(ad|un)|haie|hcit|hd\-(m|p|t)|hei\-|hi(pt|ta)|hp( i|ip)|hs\-c|ht(c(\-| |_|a|g|p|s|t)|tp)|hu(aw|tc)|i\-(20|go|ma)|i230|iac( |\-|\/)|ibro|idea|ig01|ikom|im1k|inno|ipaq|iris|ja(t|v)a|jbro|jemu|jigs|kddi|keji|kgt( |\/)|klon|kpt |kwc\-|kyo(c|k)|le(no|xi)|lg( g|\/(k|l|u)|50|54|\-[a-w])|libw|lynx|m1\-w|m3ga|m50\/|ma(te|ui|xo)|mc(01|21|ca)|m\-cr|me(rc|ri)|mi(o8|oa|ts)|mmef|mo(01|02|bi|de|do|t(\-| |o|v)|zz)|mt(50|p1|v )|mwbp|mywa|n10[0-2]|n20[2-3]|n30(0|2)|n50(0|2|5)|n7(0(0|1)|10)|ne((c|m)\-|on|tf|wf|wg|wt)|nok(6|i)|nzph|o2im|op(ti|wv)|oran|owg1|p800|pan(a|d|t)|pdxg|pg(13|\-([1-8]|c))|phil|pire|pl(ay|uc)|pn\-2|po(ck|rt|se)|prox|psio|pt\-g|qa\-a|qc(07|12|21|32|60|\-[2-7]|i\-)|qtek|r380|r600|raks|rim9|ro(ve|zo)|s55\/|sa(ge|ma|mm|ms|ny|va)|sc(01|h\-|oo|p\-)|sdk\/|se(c(\-|0|1)|47|mc|nd|ri)|sgh\-|shar|sie(\-|m)|sk\-0|sl(45|id)|sm(al|ar|b3|it|t5)|so(ft|ny)|sp(01|h\-|v\-|v )|sy(01|mb)|t2(18|50)|t6(00|10|18)|ta(gt|lk)|tcl\-|tdg\-|tel(i|m)|tim\-|t\-mo|to(pl|sh)|ts(70|m\-|m3|m5)|tx\-9|up(\.b|g1|si)|utst|v400|v750|veri|vi(rg|te)|vk(40|5[0-3]|\-v)|vm40|voda|vulc|vx(52|53|60|61|70|80|81|83|85|98)|w3c(\-| )|webc|whit|wi(g |nc|nw)|wmlb|wonu|x700|yas\-|your|zeto|zte\-/i.test(a.substr(0,4)))check = 
true})(navigator.userAgent||navigator.vendor||window.opera);
    return check;
};

Second, the initialization part. This is a piece of Angular code that you can adapt to your project. I have a SpeechSynthesisService($log, $q, $cookies, $timeout, commonService) with the following code:

    var languagesEnglish = {
        'en-US' : {desc: 'English (United States)', voices: []},
        'en-GB' : {desc: 'English (United Kingdom)', voices: []}
    };
    // Android reports this code as 'ru_RU' (with an underscore)
    var languagesRussian = {
        'ru-RU' : {desc: 'Russian', voices: []}
    };

    var speakerEng = {lang: 'en-US', desc: 'English (United States)', voice: null};
    var speakerRu = {lang: 'ru-RU', desc: 'Russian', voice: null};
    var englishSpeakers = [];

    if ('speechSynthesis' in window) {
        if (window.mobilecheck()) {
            englishSpeakers = prepareEnglishSpeakers();
            initVoicesJob.resolve();
        }

        window.speechSynthesis.onvoiceschanged = function() {
            if (initVoicesJob.promise.$$state.status) return;
            $log.debug(currentTime() + " InitVoice: onvoiceschanged");
            desktopInitializeVoices();
            englishSpeakers = prepareEnglishSpeakers();
            initVoicesJob.resolve();
        };
    }

So in the application I have an active speakerEng and an active speakerRu; in the mobile version the voices arrays are empty.

Third, when I want the application to speak some text, I call the following function:

    function sayAnyText(speaker, voiceVolume, text) {
        if (isNotSupportSpeechSynthesis()) {
            var tJob = $q.defer();
            $timeout(function() {
                tJob.resolve();
            });
            return tJob.promise;
        }

        var phrases = [];
        if (text.constructor === Array) {
            for (var ti = 0, tn = text.length; ti<tn; ti++) {
                var iPhrases = splitToPhrase(text[ti]);
                // concat arrays, fast method: https://stackoverflow.com/questions/4156101/javascript-push-array-values-into-another-array
                for (var pi = 0, pn = iPhrases.length; pi<pn; pi++) {
                    phrases.push(iPhrases[pi]);
                }
            }
        } else {
            phrases = splitToPhrase(text);
        }

        speechSynthesis.cancel();
        stopSpeechJob();
        startSpeechJob();

        for (var i = 0, n = phrases.length; i < n; i++) {
            var msg = new SpeechSynthesisUtterance();
            msg.lang = speaker.lang;
            if (speaker.voice) msg.voice = speaker.voice;
            msg.rate = 1;
            msg.volume = voiceVolume;
            msg.text = phrases[i];

            if (i + 1 == n) {
                msg.onend = function(event) {
                    $log.debug(currentTime() + "Speech ends: ", event.currentTarget.text, event);
                    stopSpeechJob();
                };
                msg.onerror = function(event) {
                    $log.error(currentTime() + "Speech ends with error: ", event.currentTarget.text, event);
                    stopSpeechJob();
                };
            } else {
                msg.onend = function(event) {
                    $log.debug(currentTime() + "Speech ends: ", event.currentTarget.text, event);
                };
                msg.onerror = function(event) {
                    $log.error(currentTime() + "Speech ends with error: ", event.currentTarget.text, event);
                };
            }

            speechSynthesis.speak(msg);
        }

        return getSpeechJob();
    }

The key part is the following:

msg.lang = speaker.lang;
if (speaker.voice) msg.voice = speaker.voice;

1. We always set a language.
2. In the desktop version the speaker has a voice; in the mobile version it does not.
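
As an illustration of how the voice field of a speaker object might be filled in, here is a minimal sketch. The names normalizeLang and pickVoice are hypothetical and do not come from the service above. Because the comment in the initialization code mentions 'ru_RU' as an Android code, the comparison treats underscores and hyphens as equivalent and ignores case.

```javascript
// Hypothetical helpers (not part of the original answer).
function normalizeLang(code) {
    // 'ru_RU' and 'ru-RU' both become 'ru-ru'
    return String(code).replace(/_/g, '-').toLowerCase();
}

// Returns the first voice whose language matches, or null when there is
// none (for example on mobile, where the voices array may be empty).
function pickVoice(voices, lang) {
    var wanted = normalizeLang(lang);
    for (var i = 0; i < voices.length; i++) {
        if (normalizeLang(voices[i].lang) === wanted) {
            return voices[i];
        }
    }
    return null;
}
```

For example, speakerRu.voice = pickVoice(speechSynthesis.getVoices(), speakerRu.lang); leaves the voice as null when nothing matches, and the if (speaker.voice) check then skips the assignment so only the language is set.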

P.S. I tested it, and it works on iOS 9 (iPhone 6), Android 5.0.1 (Samsung Galaxy 4), and desktop Chrome 48.0. Feel free to ask if you have any questions or if you need more code.

Alexey Alexeenka answered Nov 11 '22 15:11