Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

generate audio file with W3C Web Speech API

Is it possible to use W3C Web Speech API to write Javascript code which generates audio file (wav, ogg or mp3) with voice speaking given text? I mean, I want to do something like:

window.speechSynthesis.speak(new SpeechSynthesisUtterance("0 1 2 3"))

but I want sound generated with it not to be output to speakers but to file.

like image 986
user983447 Avatar asked Aug 02 '16 18:08

user983447


People also ask

Is Google Text to Speech API free?

Text-to-Speech is priced based on the number of characters sent to the service to be synthesized into audio each month. You must enable billing to use Text-to-Speech, and will be automatically charged if your usage exceeds the number of free characters allowed per month.

How good is Web Speech API?

The Web Speech API is powerful and somewhat underused. However, there are a few annoying bugs and the SpeechRecognition interface is poorly supported. speechSynthesis works surprisingly well once you iron out all of its quirks and issues.


1 Answers

The requirement is not possible using Web Speech API alone, see Re: MediaStream, ArrayBuffer, Blob audio result from speak() for recording?, How to implement option to return Blob, ArrayBuffer, or AudioBuffer from window.speechSynthesis.speak() call

Though requirement is possible using a library, for example, espeak or meSpeak, see How to create or convert text to audio at chromium browser?.

fetch("https://gist.githubusercontent.com/guest271314/f48ee0658bc9b948766c67126ba9104c/raw/958dd72d317a6087df6b7297d4fee91173e0844d/mespeak.js")
  .then(response => response.text())
  .then(text => {
    const script = document.createElement("script");
    script.textContent = text;
    document.body.appendChild(script);

    return Promise.all([
      new Promise(resolve => {
        meSpeak.loadConfig("https://gist.githubusercontent.com/guest271314/8421b50dfa0e5e7e5012da132567776a/raw/501fece4fd1fbb4e73f3f0dc133b64be86dae068/mespeak_config.json", resolve)
      }),
      new Promise(resolve => {
        meSpeak.loadVoice("https://gist.githubusercontent.com/guest271314/fa0650d0e0159ac96b21beaf60766bcc/raw/82414d646a7a7ef11bb04ddffe4091f78ef121d3/en.json", resolve)
      })
    ])
  })
  .then(() => {
    // takes approximately 14 seconds to get here
    console.log(meSpeak.isConfigLoaded());
    console.log(meSpeak.speak("what it do my ninja", {
      amplitude: 100,
      pitch: 5,
      speed: 150,
      wordgap: 1,
      variant: "m7",
      rawdata: "mime"
    }));
})
.catch(err => console.log(err));

There is also workaround using MediaRecorder, depending on system hardware How to capture generated audio from window.speechSynthesis.speak() call?.

like image 51
guest271314 Avatar answered Oct 06 '22 23:10

guest271314