How to use Google's Text-to-Speech API in Python

Question

My key is ready to go to make requests and get speech from text from Google.
I tried these commands and many more.
The docs offer no straight forward solutions to getting started with Python that I've found. I don't know where my API key goes along with the JSON and URL

One solution in their docs here is for CURL.. But involves downloading a txt after the request that has to be sent back to them in order to get the file. Is there a way to do this in Python that doesn't involve the txt I have to return them? I just want my list of strings returned as audio files.

My Code

(I put my actual key in the block above. I'm just not going to share it here.)

CodeRaptor · Accepted Answer

Configure Python App for JSON file and Install Client Library

Create a Service Account
Create a Service Account Key using the Service Account here
The JSON file downloads and save it securely
Include the Google Application Credentials in your Python App
Install the library: pip install --upgrade google-cloud-texttospeech

Using Google's Python examples found: https://cloud.google.com/text-to-speech/docs/reference/libraries Note: In Google's example it is not including the name parameter correctly. and https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/texttospeech/cloud-client/quickstart.py

Below is the modified from the example using google app credentials and wavenet voice of a female.

os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/home/yourproject-12345.json"

from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.types.SynthesisInput(text="Do no evil!")

# Build the voice request, select the language code ("en-US") 
# ****** the NAME
# and the ssml voice gender ("neutral")
voice = texttospeech.types.VoiceSelectionParams(
    language_code='en-US',
    name='en-US-Wavenet-C',
    ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)

# Select the type of audio file you want returned
audio_config = texttospeech.types.AudioConfig(
    audio_encoding=texttospeech.enums.AudioEncoding.MP3)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(synthesis_input, voice, audio_config)

# The response's audio_content is binary.
with open('output.mp3', 'wb') as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')

Voices,Name, Language Code, SSML Gender, Etc

List of Voices: https://cloud.google.com/text-to-speech/docs/voices

In the above code example I changed the voice from Google's example code to include the name parameter and to use the Wavenet voice (much improved but more expensive $16/million chars) and the SSML Gender to FEMALE.

voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-US',
        name='en-US-Wavenet-C',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)

How to use Google's Text-to-Speech API in Python

Tags:

python

api

text-to-speech

Ant

1 Answers

Configure Python App for JSON file and Install Client Library

Below is the modified from the example using google app credentials and wavenet voice of a female.

Voices,Name, Language Code, SSML Gender, Etc

CodeRaptor

Recent Activity

Donate For Us

How to use Google's Text-to-Speech API in Python

Tags:

python

api

text-to-speech

Ant

1 Answers

Configure Python App for JSON file and Install Client Library

Below is the modified from the example using google app credentials and wavenet voice of a female.

Voices,Name, Language Code, SSML Gender, Etc

CodeRaptor

Related questions

Recent Activity

Donate For Us