This is my script:
<script src="https://sdk.amazonaws.com/js/aws-sdk-2.7.13.min.js"></script>
<script>
AWS.config.region = 'eu-west-1';
AWS.config.accessKeyId = 'FOO';
AWS.config.secretAccessKey = 'BAR';
var polly = new AWS.Polly({apiVersion: '2016-06-10'});
var params = {
  OutputFormat: 'mp3', /* required */
  Text: 'Hello world', /* required */
  VoiceId: 'Joanna', /* required */
  SampleRate: '22050',
  TextType: 'text'
};
polly.synthesizeSpeech(params, function(err, data) {
  if (err) console.log(err, err.stack); // an error occurred
  else console.log(data); // successful response
});
</script>
The request succeeds, and I get this kind of response:
How do I use this kind of response? I understand that the response is deserialized audio, but how do I actually play it, say, inside an HTML5 audio element?
Furthermore, this answer on SO explains why this type of array is suitable for audio data: https://stackoverflow.com/a/26320913/1325575
Using the Web Audio API:
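// Run inside an async function (or a module with top-level await).
const result = await polly.synthesizeSpeech(params).promise();
const aContext = new AudioContext();
const source = aContext.createBufferSource();
// Decode the MP3 bytes into an AudioBuffer, then route it to the speakers.
source.buffer = await aContext.decodeAudioData(result.AudioStream.buffer);
source.connect(aContext.destination);
source.start();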
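One caveat with passing `.buffer` directly: a typed array can be a view onto a larger ArrayBuffer, in which case `.buffer` contains more bytes than the audio itself. Below is a minimal sketch of a workaround, assuming `AudioStream` arrives as a Uint8Array (which is what the browser SDK appears to return). The helper name `toExactArrayBuffer` is hypothetical and not part of the AWS SDK.

```javascript
// Hypothetical helper (not part of the AWS SDK): copy exactly the bytes
// covered by a typed-array view into a fresh ArrayBuffer, ignoring any
// extra bytes that may exist in the backing buffer.
function toExactArrayBuffer(view) {
  return view.buffer.slice(view.byteOffset, view.byteOffset + view.byteLength);
}

// Browser usage (assumes the aContext and source variables from the snippet above):
// source.buffer = await aContext.decodeAudioData(toExactArrayBuffer(result.AudioStream));
```

If the view already spans its whole buffer, this just produces a copy, so it should be safe to use unconditionally.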
To play the response in an HTML5 audio element, convert the audio stream to a Blob URL:
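// Assumed: audioStream is data.AudioStream from the synthesizeSpeech callback,
// and audioElement is an existing <audio> element on the page.
var uInt8Array = new Uint8Array(audioStream);
var arrayBuffer = uInt8Array.buffer;
var blob = new Blob([arrayBuffer]);
var url = URL.createObjectURL(blob);
audioElement.src = url;
audioElement.play();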
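The Blob conversion can also be wrapped in a small function that sets an explicit MIME type, which some browsers appear to need before they will play the audio. This is only a sketch: `audioStreamToBlob` is a hypothetical helper name, and `audio/mpeg` assumes the request used `OutputFormat: 'mp3'`.

```javascript
// Hypothetical helper: wrap Polly's MP3 bytes in a Blob with an explicit
// MIME type. Assumes the audio was requested with OutputFormat 'mp3'.
function audioStreamToBlob(audioStream) {
  // Passing the typed array itself (rather than its .buffer) means the Blob
  // contains only the bytes covered by the view.
  return new Blob([new Uint8Array(audioStream)], { type: 'audio/mpeg' });
}

// Browser usage (assumes an <audio> element referenced by audioElement):
// var url = URL.createObjectURL(audioStreamToBlob(data.AudioStream));
// audioElement.src = url;
// audioElement.addEventListener('ended', function () {
//   URL.revokeObjectURL(url); // release the Blob once playback has finished
// }, { once: true });
// audioElement.play();
```

Revoking the object URL after playback is optional, but it lets the browser free the audio data when many clips are generated in one session.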
I created a JavaScript library called ChattyKathy that will handle the entire process for you if you want to take the easy way out.
Just pass it an AWS Credentials object and then tell her what to say. She'll call AWS, transform the response, and play the audio.
var settings = {
  awsCredentials: awsCredentials,
  awsRegion: "us-west-2",
  pollyVoiceId: "Justin",
  cacheSpeech: true
};
var kathy = ChattyKathy(settings);
kathy.Speak("Hello world, my name is Kathy!");
kathy.Speak("I can be used for an amazing user experience!");
Elliott's Chatty Kathy code worked beautifully for me, but there are two separate issues: one with Safari and one with mobile browsers.
Safari: When creating the blob, the content type MUST be specified:
var blob = new Blob([arrayBuffer], {type: 'audio/mpeg'});
url = (window.URL || window.webkitURL).createObjectURL(blob); // fall back to the prefixed API on older Safari
Mobile: The above applies, and playback must also be initiated by a user touch event. Note: Older iOS versions seem to require that playback start synchronously inside the touch event handler, so a touch event that kicks off a promise chain that eventually calls audio.play() will fail. Later iOS versions seem to be smarter about this.
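One way to satisfy the touch requirement is to prepare the audio URL ahead of time and call play() directly in the event handler, with no asynchronous work in between. The following is a rough sketch under that assumption. `playOnTap` is a hypothetical helper, `button` is any element the user can tap, and `url` is a Blob URL that has already been created from the Polly response.

```javascript
// Hypothetical helper: start playback synchronously inside a tap handler.
// Assumes `url` was prepared beforehand, so nothing asynchronous has to
// happen between the user's tap and the call to play().
function playOnTap(button, audio, url) {
  function onTap() {
    button.removeEventListener('click', onTap); // only handle the first tap
    audio.src = url;
    var result = audio.play(); // called directly in the gesture handler
    // Newer browsers return a promise from play(); older ones return undefined.
    if (result && typeof result.catch === 'function') {
      result.catch(function (err) {
        console.log('Playback was blocked:', err);
      });
    }
  }
  button.addEventListener('click', onTap);
}

// Browser usage (element ids are placeholders):
// playOnTap(document.getElementById('speak'), document.getElementById('player'), url);
```

The trade-off is that the speech has to be synthesized before the user taps, for example when the page loads or when the text changes.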