
How to correctly specify SSML in an Alexa Skill lambda function?

I am trying to build an Alexa skill in which Alexa speaks text that has been marked up with SSML. I have tried to mimic the example in this repo, but I always receive a lambda response of

{
  ...
  "response": {
    "outputSpeech": {
      "type": "SSML",
      "ssml": "<speak> [object Object] </speak>"
    },
  ...
}

and Alexa literally says "object object".


This is what I input to my lambda function (using node.js):

var speechOutput = {
    type: "SSML",
    ssml: 'This <break time=\"0.3s\" /> is not working',
};

this.emit(':tellWithCard', speechOutput, SKILL_NAME, "ya best not repeat after me.")

Setting speechOutput like this also isn't working:

var speechOutput = {
    type: "SSML",
    ssml: 'This <break time=\"0.3s\" /> is not working',
};


EDIT:

index.js

'use strict';

var Alexa = require('alexa-sdk');

var APP_ID = "MY_ID_HERE";
var SKILL_NAME = "MY_SKILL_NAME";

exports.handler = function(event, context, callback) {
    var alexa = Alexa.handler(event, context);
    alexa.APP_ID = APP_ID;
    alexa.registerHandlers(handlers);
    alexa.execute();
};

var handlers = {
    'LaunchRequest': function () {
        this.emit('Speaketh');
    },
    'MyIntent': function () {
        this.emit('Speaketh');
    },
    'Speaketh': function () {
        var speechOutput = {
            type: "SSML",
            ssml: 'This <break time=\"0.3s\" /> is not working',
        };

        this.emit(':tellWithCard', speechOutput, SKILL_NAME, "some text here")
    }
};

Anyone have any idea where I'm going wrong?

David Baker asked Jan 21 '17 05:01



2 Answers

Per the alexa-sdk source code for response.js on GitHub, the speechOutput argument is expected to be a string, not an object. response.js is responsible for building the response object that you are trying to build by hand in your code:

this.handler.response = buildSpeechletResponse({
    sessionAttributes: this.attributes,
    output: getSSMLResponse(speechOutput),
    shouldEndSession: true
});

Digging deeper, buildSpeechletResponse() invokes createSpeechObject(), which is directly responsible for creating the outputSpeech object in the Alexa Skills Kit response.

So for simple responses with no advanced SSML functionality, just send a string as that first parameter on :tell and let alexa-sdk handle it from there.
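To see why passing an object ends up as "[object Object]", here is a minimal sketch. The helper name wrapInSpeak is hypothetical and is not the SDK's actual code; it only assumes, based on the response shown in the question, that the SDK concatenates its argument between <speak> tags.

```javascript
'use strict';

// Hypothetical helper: imitates the <speak> wrapping seen in the response.
// This is an assumption for illustration, not code taken from alexa-sdk.
function wrapInSpeak(speechOutput) {
    return '<speak> ' + speechOutput + ' </speak>';
}

var asObject = { type: 'SSML', ssml: 'This <break time="0.3s" /> is not working' };
var asString = 'This <break time="0.3s" /> should work';

// Concatenating an object with a string coerces it to "[object Object]".
console.log(wrapInSpeak(asObject)); // <speak> [object Object] </speak>

// A plain string passes through unchanged, markup included.
console.log(wrapInSpeak(asString)); // <speak> This <break time="0.3s" /> should work </speak>
```

Under that assumption, the fix is simply to pass the markup as a string rather than wrapping it in an object.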


For advanced SSML functionality, such as pauses, take a look at the ssml-builder npm package. It lets you build SSML response content without having to assemble or escape the markup by hand.

Example usage:

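var Speech = require('ssml-builder');

var speech = new Speech();

speech.say('This is a test response & works great!');
speech.pause('100ms');
speech.say('How can I help you?');
// Passing true omits the surrounding <speak> tags, which alexa-sdk adds itself.
var speechOutput = speech.ssml(true);
this.emit(':ask', speechOutput, speechOutput);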

This example emits an ask response where both the speech output and the reprompt speech are set to the same value. SSML Builder will replace the ampersand (which is not valid in SSML unless escaped) and insert a 100ms pause between the two say statements.

Example response:

Alexa Skills Kit will emit the following response object for the code above:

{
  "outputSpeech": {
    "type": "SSML",
    "ssml": "<speak> This is a test response and works great! <break time='100ms'/> How can I help you? </speak>"
  },
  "shouldEndSession": false,
  "reprompt": {
    "outputSpeech": {
      "type": "SSML",
      "ssml": "<speak> This is a test response and works great! <break time='100ms'/> How can I help you? </speak>"
    }
  }
}

Anthony Neace answered Sep 28 '22 03:09


This is an old question, but I recently had a similar problem and wanted to contribute an answer that doesn't need extra dependencies.

As mentioned, speechOutput is supposed to be a string. Alexa says "object object" because you are passing a JavaScript object instead, and it gets coerced to the string "[object Object]".

Trying your handler as follows

'Speaketh': function () {
    var speechOutput = 'This <break time="0.3s" /> should work';

    this.emit(':tellWithCard', speechOutput, SKILL_NAME, "some text here")
}

returns this response

{
  ...
  "response": {
    "outputSpeech": {
      "ssml": "<speak> This <break time=\"0.3s\" /> should work </speak>",
      "type": "SSML"
    },
  ...
}
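
If you build the string by hand, keep in mind the ampersand issue mentioned in the other answer: reserved XML characters in your plain text need escaping before they go into SSML. The sketch below is one way to do that without a dependency. The helper name escapeForSsml is hypothetical (it is not part of alexa-sdk), and it should only be applied to the plain-text parts, not to the SSML tags themselves.

```javascript
'use strict';

// Hypothetical helper (not part of alexa-sdk): escapes characters that are
// reserved in XML so plain text can be embedded in an SSML string.
function escapeForSsml(text) {
    return String(text)
        .replace(/&/g, '&amp;')   // must run first so later entities are not re-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Escape only the plain text, then add the SSML markup around it.
var userText = 'Salt & pepper';
var speechOutput = escapeForSsml(userText) + ' <break time="0.3s" /> coming up';

console.log(speechOutput); // Salt &amp; pepper <break time="0.3s" /> coming up
```

The resulting string can then be passed as the first argument to this.emit(':tellWithCard', ...) as in the handler above.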

Edwin answered Sep 28 '22 01:09