
How to insert pause in speech synthesis with grammatical hints

I am writing a simple spelling test app using the HTML5 SpeechSynthesis API. The text I would like my app to say is something like the following: "The spelling word is Cat. The cat chased the dog.".

The API tends to race from the first sentence to the second without much of a pause. I wonder if there is a way to insert a short pause between the two sentences. I realize I could create two separate utterances and use the pause() call. However, the code would be simpler and less brittle if I could simply insert grammatical hints.

Normally in spoken English, one tends to pause a little longer between paragraphs. So I inserted a newline character in my text, but there was no noticeable impact.

I also tried using an ellipsis.

Is there any way to do this or am I stuck breaking everything into separate utterances?

Bob Woodley asked Jan 30 '16 17:01



4 Answers

Split your text on commas (or a custom delimiter), speak each part as its own utterance, and add your own gap between parts using a timeout.

Here is a simple example as a proof of concept. Extending it, you can customize your text to include hints as to how long to pause.

function speakMessage(message, PAUSE_MS = 500) {
  try {
    // Split on commas; each part becomes its own utterance.
    const messageParts = message.split(',')

    let currentIndex = 0
    const speak = (textToSpeak) => {
      const msg = new SpeechSynthesisUtterance();
      // getVoices() can return an empty list until the voices have loaded.
      const voices = window.speechSynthesis.getVoices();
      if (voices.length > 0) msg.voice = voices[0];
      msg.volume = 1; // 0 to 1
      msg.rate = 1; // 0.1 to 10
      msg.pitch = 0.1; // 0 to 2 (default is 1)
      msg.text = textToSpeak;
      msg.lang = 'en-US';

      // When one part finishes, wait PAUSE_MS, then speak the next part.
      msg.onend = function() {
        currentIndex++;
        if (currentIndex < messageParts.length) {
          setTimeout(() => {
            speak(messageParts[currentIndex])
          }, PAUSE_MS)
        }
      };
      speechSynthesis.speak(msg);
    }
    speak(messageParts[0])
  } catch (e) {
    // Only catches synchronous errors, such as the API being unavailable.
    console.error(e)
  }
}


function run(pause) {
  speakMessage('Testing 1,2,3', pause)
}
<button onclick='run(0)'>Speak No Pause</button>
<button onclick='run(500)'>Speak Pause</button>
<button onclick='run(1000)'>Speak Pause Longer</button>
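
One way to extend the example above is to put the pause length in the text itself. The following is only a sketch: the `[pause:800]` marker syntax is made up for illustration and is not part of the SpeechSynthesis API, and `parsePauseHints` and `speakWithHints` are hypothetical helper names.

```javascript
// Hypothetical marker syntax: "[pause:800]" means "wait 800 ms here".
// Illustration only; this is not part of the Web Speech API.
const PAUSE_MARKER = /\[pause:(\d+)\]/g;

// Split a message into segments of { text, pauseAfterMs }.
function parsePauseHints(message) {
  const segments = [];
  let lastIndex = 0;
  for (const match of message.matchAll(PAUSE_MARKER)) {
    const text = message.slice(lastIndex, match.index).trim();
    const pauseMs = Number(match[1]);
    if (text) {
      segments.push({ text, pauseAfterMs: pauseMs });
    } else if (segments.length > 0) {
      // Marker with no text before it: lengthen the previous pause.
      segments[segments.length - 1].pauseAfterMs += pauseMs;
    }
    lastIndex = match.index + match[0].length;
  }
  const tail = message.slice(lastIndex).trim();
  if (tail) segments.push({ text: tail, pauseAfterMs: 0 });
  return segments;
}

// Speak the segments one at a time, waiting after each one ends.
// Browser only: relies on SpeechSynthesisUtterance and speechSynthesis.
function speakWithHints(message) {
  const segments = parsePauseHints(message);
  const speakNext = (index) => {
    if (index >= segments.length) return;
    const { text, pauseAfterMs } = segments[index];
    const utterance = new SpeechSynthesisUtterance(text);
    utterance.onend = () => setTimeout(() => speakNext(index + 1), pauseAfterMs);
    window.speechSynthesis.speak(utterance);
  };
  speakNext(0);
}
```

With this, the text from the question could be written as `speakWithHints('The spelling word is Cat. [pause:800] The cat chased the dog.')`, keeping the pause hint next to the words it applies to.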
Steven Spungin answered Oct 23 '22 10:10



Just insert

<silence msec="5000" />

into the text to get a 5-second pause (Source).

Disclaimer: This works only in a user agent whose speech engine interprets the tag. Other engines may ignore it or try to read it aloud.

// code taken from https://richjenks.com/dev/speechsynthesis/
var utterance  = new SpeechSynthesisUtterance(),
    speak      = document.getElementById("speak"),
    text       = document.getElementById("text");

// Delay links and events because speechSynthesis is funny
speechSynthesis.getVoices();
setTimeout(function () {
    // Add event listeners
    var voiceLinks = document.querySelectorAll(".voice");
    for (var i = 0; i < voiceLinks.length; i++) {
        voiceLinks[i].addEventListener("click", function (event) {
            utterance.voice = speechSynthesis.getVoices()[this.dataset.voice];
        });
    }
}, 100);

// Say text when button is clicked
speak.addEventListener("click", function (event) {
    utterance.text = text.value;
    speechSynthesis.speak(utterance);
});
<textarea id="text" rows="5" cols="50">Hi <silence msec="2000" /> Flash!</textarea>
<br>
<button id="speak">Speak</button>
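
If the text is assembled in code rather than typed into a textarea, the tag can be inserted between sentences programmatically. This is a minimal sketch, assuming the active speech engine honors the `<silence>` tag; `joinWithSilence` and `spellingPrompt` are hypothetical names, not part of any API.

```javascript
// Join sentences with a <silence> tag between them.
// Assumption: the active speech engine interprets this tag; many do not.
function joinWithSilence(sentences, msec = 1000) {
  const tag = `<silence msec="${msec}" />`;
  return sentences
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0)
    .join(` ${tag} `);
}

const spellingPrompt = joinWithSilence(
  ['The spelling word is Cat.', 'The cat chased the dog.'],
  1500
);
// In a browser:
// speechSynthesis.speak(new SpeechSynthesisUtterance(spellingPrompt));
```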
Nina Scholz answered Oct 23 '22 11:10



Using an exclamation point "!" adds a nice delay for some reason.

You can chain them together with periods to extend the pause.

"Example text! . ! . ! . !"
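
A small helper can build that pattern so the pause length stays adjustable in one place. This is only a sketch: `withPunctuationPause` is a hypothetical name, and how long each extra "!" actually pauses depends on the voice and speech engine.

```javascript
// Append "!" to the text, then " . !" once per requested repeat.
// The resulting delay is engine-dependent; this only builds the string.
function withPunctuationPause(text, repeats = 1) {
  const count = Math.max(0, repeats);
  return `${text.trim()}!${' . !'.repeat(count)}`;
}

const padded = withPunctuationPause('Example text', 3);
// In a browser:
// speechSynthesis.speak(new SpeechSynthesisUtterance(padded));
```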
Myka answered Oct 23 '22 11:10



I’ve found inserting synthetic pauses using commas to be quite useful (as well as making other manipulations). Here’s a short excerpt:

var speech = new SpeechSynthesisUtterance(),
    $content = document.querySelector('main').cloneNode(true),
    $space = $content.querySelectorAll('pre'),
    $pause_before = $content.querySelectorAll('h2, h3, h4, h5, h6, p, li, dt, blockquote, pre, figure, footer'),
    $skip = $content.querySelectorAll('aside, .dont_read');

// Don’t read
$skip.forEach(function( $el ){
    $el.innerHTML = '';
});

// spacing out content
$space.forEach(function($el){
    $el.innerHTML = ' ' + $el.innerHTML.replace(/[\r\n\t]/g, ' ') + ' ';
});

// Synthetic Pauses
$pause_before.forEach(function( $el ){
    $el.innerHTML = ' , ' + $el.innerHTML;
});

speech.text = $content.textContent;

The key is to clone the content node first so you can work with the copy in memory rather than manipulating the actual content on the page. It works pretty well for me, and I can control it in the JavaScript code rather than having to modify the page source.

Aaron Gustafson answered Oct 23 '22 09:10
