Twilio can provide call recording, but that's not real-time. Is it possible to write an app that processes the caller's audio in real-time and responds after processing the audio? I'd like to have some software "listen" to the speaker and respond programmatically.
When your phone number receives an incoming call, Twilio sends an HTTP request to your server at /answer. Your app replies with TwiML instructions that tell Twilio how to respond, for example with a text-to-speech message. Twilio reads those instructions and plays the voice response to the caller.
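As a rough illustration of that exchange, here is a minimal sketch of the TwiML body a /answer handler might return. It uses only the Python standard library; the function name `answer_twiml` and the greeting text are hypothetical, and wiring this into a web framework is left out.

```python
import xml.etree.ElementTree as ET


def answer_twiml(message: str) -> str:
    """Build a TwiML document that asks Twilio to speak `message`.

    Hypothetical helper: a real /answer handler would return this string
    as the HTTP response body with an XML content type.
    """
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say")
    say.text = message  # ElementTree escapes &, <, > when serializing
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


if __name__ == "__main__":
    print(answer_twiml("Thanks for calling!"))
```

Building the XML with a library rather than string formatting avoids malformed TwiML when the message contains characters such as `&`.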
You can use a Function to make a call from your Twilio phone number via Programmable Voice. The call's to and from parameters must both be specified for the call to be placed, and valid TwiML must be provided through either the url or the twiml parameter.
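The parameter rules above can be sketched as a small validation step. This is not a Twilio Function (those run JavaScript); it is a Python sketch, under the assumption that the request is sent as form fields named `To`, `From`, `Url`, and `Twiml`. The helper `build_call_params` is hypothetical and does not send anything.

```python
from typing import Dict, Optional


def build_call_params(
    to: str,
    from_: str,
    url: Optional[str] = None,
    twiml: Optional[str] = None,
) -> Dict[str, str]:
    """Validate and assemble the form fields for an outbound call request.

    Hypothetical helper: it only checks the rules described above and
    returns a dict; posting it to the API is left to the caller.
    """
    if not to or not from_:
        raise ValueError("both 'to' and 'from_' must be specified")
    if url is None and twiml is None:
        raise ValueError("provide TwiML via 'url' or 'twiml'")

    params = {"To": to, "From": from_}
    if url is not None:
        params["Url"] = url      # Twilio fetches TwiML from this URL
    if twiml is not None:
        params["Twiml"] = twiml  # TwiML passed inline
    return params


if __name__ == "__main__":
    # Placeholder numbers, not real ones.
    print(build_call_params("+15550100", "+15550199",
                            twiml="<Response><Say>Hello</Say></Response>"))
```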
When using <Say>, you can choose between the man, woman, alice, or Amazon Polly voices. To use one of them, either configure the Text to Speech settings in the Twilio Console or set the voice attribute on <Say>. The Console is also where you can view and change the default voice.
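For the attribute route, a minimal sketch of TwiML with a per-element voice might look like this. The helper name `say_with_voice` is hypothetical, and `alice` is simply one of the voices named above.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def say_with_voice(message: str, voice: Optional[str] = None) -> str:
    """Build <Response><Say> TwiML, setting the voice attribute if given.

    When `voice` is None the attribute is omitted, so the default voice
    configured in the Console would apply.
    """
    response = ET.Element("Response")
    attrs = {"voice": voice} if voice else {}
    say = ET.SubElement(response, "Say", attrs)
    say.text = message
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(say_with_voice("Hello there", voice="alice"))
```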
Two years later, Twilio has released a feature that covers what I was trying to build on my own. Programmable Voice now has a real-time speech recognition service built in. It's in public beta: https://www.twilio.com/blog/2017/05/introducing-speech-recognition.html
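A minimal sketch of how that might be used, assuming speech recognition is requested with `<Gather input="speech">` and that Twilio posts the transcript to the `action` URL in a form field named `SpeechResult`. The helper names and the `/transcript` path are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs


def gather_speech_twiml(action_url: str, prompt: str) -> str:
    """Build TwiML that prompts the caller and collects spoken input."""
    response = ET.Element("Response")
    gather = ET.SubElement(
        response, "Gather", {"input": "speech", "action": action_url}
    )
    say = ET.SubElement(gather, "Say")
    say.text = prompt
    return ET.tostring(response, encoding="unicode")


def extract_speech(form_body: str) -> str:
    """Pull the transcript out of a URL-encoded request body.

    Assumes the field is called SpeechResult; returns "" if it is absent.
    """
    fields = parse_qs(form_body)
    return fields.get("SpeechResult", [""])[0]


if __name__ == "__main__":
    print(gather_speech_twiml("/transcript", "What can I help you with?"))
    print(extract_speech("SpeechResult=check+my+balance&Confidence=0.91"))
```

The handler behind the action URL can then decide programmatically how to respond, for instance by returning another `<Say>`.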
For people still looking, Twilio now has Media Streams, which covers this use case! It's a TwiML instruction (<Stream>) that sends the call audio to your server over a WebSocket.
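To make the WebSocket side concrete, here is a sketch under a few assumptions: the stream is started with `<Start><Stream url="..."/></Start>`, each WebSocket message is JSON with an `event` field, and `media` events carry base64-encoded audio in `media.payload`. The helper names and the `wss://example.com/audio` URL are hypothetical, and the WebSocket server itself is omitted.

```python
import base64
import json
import xml.etree.ElementTree as ET
from typing import Optional


def stream_twiml(ws_url: str) -> str:
    """Build TwiML asking Twilio to stream call audio to `ws_url`."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", {"url": ws_url})
    return ET.tostring(response, encoding="unicode")


def audio_from_message(raw: str) -> Optional[bytes]:
    """Return the decoded audio bytes from one WebSocket message.

    Messages that are not media events (for example start or stop
    notifications) yield None.
    """
    msg = json.loads(raw)
    if msg.get("event") != "media":
        return None
    return base64.b64decode(msg["media"]["payload"])


if __name__ == "__main__":
    print(stream_twiml("wss://example.com/audio"))
    sample = json.dumps(
        {"event": "media",
         "media": {"payload": base64.b64encode(b"\x7f\xff").decode()}}
    )
    print(audio_from_message(sample))
```

The decoded bytes can be fed to whatever processing you need, such as a speech recognizer, as they arrive rather than after the call ends.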