
How to play AudioStream response in AWS Polly using JavaScript SDK?

This is my script:

<script src="https://sdk.amazonaws.com/js/aws-sdk-2.7.13.min.js"></script>
<script>
    AWS.config.region = 'eu-west-1';
    AWS.config.accessKeyId = 'FOO';
    AWS.config.secretAccessKey = 'BAR';

    var polly = new AWS.Polly({apiVersion: '2016-06-10'});

    var params = {
        OutputFormat: 'mp3', /* required */
        Text: 'Hello world', /* required */
        VoiceId: 'Joanna', /* required */
        SampleRate: '22050',
        TextType: 'text'
    };

    polly.synthesizeSpeech(params, function(err, data) {
        if (err) console.log(err, err.stack); // an error occurred
        else     console.log(data);           // successful response
    });
</script>

The request succeeds, and I get this kind of response:

[screenshot of the response logged to the console; image not available]

How do I use this kind of response? I understand that the response is deserialized audio, but how do I actually play it, say, inside an HTML5 audio element?

Furthermore, this answer on SO explains why this type of array is suitable for audio data: https://stackoverflow.com/a/26320913/1325575

The Onin, asked Dec 10 '16 00:12


3 Answers

Using the Web Audio API:

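// Run this inside an async function (or a module with top-level await).
const result = await polly.synthesizeSpeech(params).promise();

const aContext = new AudioContext();

// Decode the returned audio bytes and play them through the default output.
const source = aContext.createBufferSource();
source.buffer = await aContext.decodeAudioData(result.AudioStream.buffer);
source.connect(aContext.destination);
source.start();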

Docs:

  • AudioContext
  • Decode ArrayBuffer
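
One caveat with `result.AudioStream.buffer`: a `Uint8Array` can be a view onto a larger `ArrayBuffer`, in which case `.buffer` would also contain bytes that are not part of the audio. The following is a minimal sketch of a defensive copy, assuming `AudioStream` arrives as a `Uint8Array`; `toArrayBuffer` is a hypothetical helper name, not part of the AWS SDK.

```javascript
// Hypothetical helper (not part of the AWS SDK): copy exactly the bytes that
// a typed-array view covers into a standalone ArrayBuffer.
function toArrayBuffer(view) {
  return view.buffer.slice(view.byteOffset, view.byteOffset + view.byteLength);
}

// Illustration: a 3-byte view into the middle of an 8-byte buffer.
const backing = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7]).buffer;
const view = new Uint8Array(backing, 2, 3); // covers bytes 2, 3 and 4
const exact = toArrayBuffer(view);          // holds only those 3 bytes
```

With such a helper, the decode call above could be written as `aContext.decodeAudioData(toArrayBuffer(result.AudioStream))`. Passing a copy also leaves the original `AudioStream` usable afterwards, since `decodeAudioData` takes ownership of the buffer it is given.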
EnverOsmanov, answered Nov 15 '22 04:11


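 // audioStream is data.AudioStream from the synthesizeSpeech callback;
 // audioElement is an <audio> element on the page.
 var uInt8Array = new Uint8Array(audioStream);
 var arrayBuffer = uInt8Array.buffer;
 var blob = new Blob([arrayBuffer]);
 var url = URL.createObjectURL(blob);

 audioElement.src = url;
 audioElement.play();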

I created a JavaScript library called ChattyKathy that will handle the entire process for you if you want to take the easy way out.

Just pass it an AWS Credentials object and then tell her what to say. She'll call AWS, transform the response, and play the audio.

var settings = {
    awsCredentials: awsCredentials,
    awsRegion: "us-west-2",
    pollyVoiceId: "Justin",
    cacheSpeech: true
};

var kathy = ChattyKathy(settings);

kathy.Speak("Hello world, my name is Kathy!");
kathy.Speak("I can be used for an amazing user experience!");
Elliott, answered Nov 15 '22 04:11


Elliott's ChattyKathy code worked beautifully for me, but I ran into two separate issues, one on Safari and one on mobile.

Safari: When creating the blob, the content type MUST be specified:

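var blob = new Blob([arrayBuffer], {type: 'audio/mpeg'});
// Fall back to the prefixed webkitURL only where the standard URL object is missing.
var url = (window.URL || window.webkitURL).createObjectURL(blob);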
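
Rather than hard-coding the MIME type, it may be possible to take it from the response itself, since the `synthesizeSpeech` result carries a `ContentType` field alongside `AudioStream`. The sketch below assumes that field is present and that `AudioStream` is a `Uint8Array`; `pollyResponseToBlob` and `fakeResponse` are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: wrap a synthesizeSpeech response in a typed Blob.
// Assumes data.AudioStream is a Uint8Array and data.ContentType is the MIME
// type reported by Polly (for example 'audio/mpeg' for OutputFormat 'mp3').
function pollyResponseToBlob(data) {
  var type = data.ContentType || 'audio/mpeg'; // fallback assumes mp3 output
  return new Blob([data.AudioStream], { type: type });
}

// Stand-in for a real response, for illustration only.
var fakeResponse = {
  AudioStream: new Uint8Array([0x49, 0x44, 0x33]),
  ContentType: 'audio/mpeg'
};
var typedBlob = pollyResponseToBlob(fakeResponse);
```

In the browser, the resulting Blob can then be handed to `createObjectURL` exactly as in the snippet above.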

Mobile: The above applies, and playback must also be initiated by a user touch event. Note: Older iOS versions seem to require that playback be initiated synchronously within the touch event handler, so a touch event that starts a promise chain that eventually calls audio.play() will fail. Later iOS versions seem to be smarter about this.
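
One way to satisfy the touch requirement is to fetch the audio and set the element's `src` ahead of time, so that the tap handler has nothing left to do but call `play()`. This is only a sketch under that assumption; `playOnTap`, `trigger` and `audio` are hypothetical names, and the behaviour on any particular iOS version is not guaranteed.

```javascript
// Hypothetical helper: start playback directly inside the gesture handler.
// `trigger` is any element the user taps; `audio` is an <audio> element whose
// src was already set (for example to a Blob URL) before the tap happens.
function playOnTap(trigger, audio) {
  trigger.addEventListener('click', function () {
    // Called synchronously in the handler, with no promise or callback in
    // between, so the browser can still tie it to the user gesture.
    var result = audio.play();
    // Newer browsers return a promise from play(); log a rejection if so.
    if (result && typeof result.catch === 'function') {
      result.catch(function (err) {
        console.log('Playback was blocked:', err);
      });
    }
  });
}
```

Typical usage would be `playOnTap(document.getElementById('speak'), audioElement)` after the Polly request has completed, where `speak` is an assumed button id.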

fuzzy marmot, answered Nov 15 '22 02:11