How to convert LINEAR16 text-to-speech output to an audio file

I just started playing with the Google Text-to-Speech API. I sent a POST request to:

https://texttospeech.googleapis.com/v1/text:synthesize?fields=audioContent&key={YOUR_API_KEY}

with the following data:

{
  "input": {
    "text": "Hola esto es una prueba"
  },
  "voice": {
    "languageCode": "es-419"
  },
  "audioConfig": {
    "audioEncoding": "LINEAR16",
    "speakingRate": 1,
    "pitch": 0
  }
}

and I got a 200 response, with the content:

{
    "audioContent" : "UklGRn6iCwBXQVZFZm10I...(super long string)"
}

I assume this is encoded in some way (I'm not sure of the right term), but I would like to actually hear what that "audioContent" is.

Simon Ernesto Cardenas Zarate asked Sep 02 '25 03:09


1 Answer

As Tanaike pointed out, the response is indeed Base64-encoded. To actually listen to the audio, I pasted the Base64 string into a file, then ran:

base64 -d audio.txt > audio.wav

and that did the trick.
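
For anyone who would rather do the same decoding from a script, here is a minimal Python sketch using only the standard library. It assumes the whole JSON response is available as a string; the function names save_audio_content and is_wav are made up for illustration and are not part of any Google client library. The "UklGR" prefix in the response is what Base64 produces for bytes starting with "RIFF", which suggests the LINEAR16 output already carries a WAV header, so the decoded bytes can be written out as a .wav file unchanged.

```python
import base64
import json
import wave


def is_wav(data: bytes) -> bool:
    """Return True if the bytes start with a RIFF/WAVE container header."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def save_audio_content(response_text: str, out_path: str) -> int:
    """Decode the Base64 "audioContent" field of the JSON response and
    write the raw bytes to out_path. Returns the number of bytes written.

    Assumption: response_text is the full JSON body returned by the
    text:synthesize request shown in the question.
    """
    payload = json.loads(response_text)
    audio_bytes = base64.b64decode(payload["audioContent"])
    with open(out_path, "wb") as f:
        f.write(audio_bytes)
    return len(audio_bytes)


def describe_wav(path: str) -> dict:
    """Read basic properties of a WAV file as a quick sanity check."""
    with wave.open(path, "rb") as w:
        return {
            "channels": w.getnchannels(),
            "sample_width_bytes": w.getsampwidth(),
            "frame_rate": w.getframerate(),
            "frames": w.getnframes(),
        }
```

If the response was saved as response.json (a hypothetical file name), calling save_audio_content(open("response.json").read(), "audio.wav") should produce the same file as the base64 -d command, and describe_wav("audio.wav") can confirm that it opens as a valid WAV file.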

Simon Ernesto Cardenas Zarate answered Sep 05 '25 00:09