I'm using Python 3.8 and I copy-pasted this code as a test.
from google.cloud import texttospeech
# Instantiates a client
client = texttospeech.TextToSpeechClient()
# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")
# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
)
# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)
# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)
# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')
This is the code shown by Google, as can be seen here: GOOGLE LINK
My problem is that I get this error:
PS C:\Users\User\Desktop> & C:/Users/User/AppData/Local/Programs/Python/Python38/python.exe "c:/Users/User/Desktop/from google.cloud import texttospeech.py"
Traceback (most recent call last):
  File "c:/Users/User/Desktop/from google.cloud import texttospeech.py", line 7, in <module>
    synthesis_input = texttospeech.types.SynthesisInput(text="Hello, World!")
AttributeError: module 'google.cloud.texttospeech' has no attribute 'types'
PS C:\Users\User\Desktop>
I tried changing the code to pass the credentials directly, but the problem persists. This is the line I changed:
client = texttospeech.TextToSpeechClient(credentials="VoiceAutomated-239f1c05600c.json")
The Text-to-Speech API enables developers to generate human-like speech. The API converts text into audio formats such as WAV, MP3, or Ogg Opus. It also supports Speech Synthesis Markup Language (SSML) inputs to specify pauses, numbers, date and time formatting, and other pronunciation instructions.
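As a rough illustration of the SSML input mentioned above, the sketch below builds a small SSML string with pauses between sentences. The helper name `to_ssml` and its parameters are hypothetical; the resulting string would be passed as `ssml=` instead of `text=` when building the synthesis input.

```python
from xml.sax.saxutils import escape


def to_ssml(sentences, pause_ms=500):
    """Join sentences into one SSML document with a pause between them."""
    # Escape &, < and > so that plain text cannot break the markup.
    parts = [escape(sentence) for sentence in sentences]
    pause = f'<break time="{pause_ms}ms"/>'
    return "<speak>" + pause.join(parts) + "</speak>"


print(to_ssml(["Hello, World!", "Tom & Jerry"], pause_ms=300))
```

Escaping first matters because the service expects well-formed SSML, and raw ampersands or angle brackets in user text would make the markup invalid.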
You may be wondering how Google Cloud's Text-to-Speech differs from other text-to-speech services. For one, it converts text into more natural-sounding speech thanks to artificial intelligence (these are called the WaveNet voices).
To interact with the Text-to-Speech API, you need to create a service account in the Cloud Console: go to the Create service account page, create a project (or select an existing one), and enable the Cloud Text-to-Speech API for the project.
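Once a key file for the service account has been downloaded, one common approach is to point the `GOOGLE_APPLICATION_CREDENTIALS` environment variable at it before creating the client. The sketch below is only a sanity check under the assumption that the file is a standard service account key; the helper name `use_service_account_key` is hypothetical.

```python
import json
import os


def use_service_account_key(path):
    """Check that `path` looks like a service account key and export it.

    Returns the account's client_email (an empty string if it is absent).
    """
    with open(path, encoding="utf-8") as handle:
        info = json.load(handle)
    # Service account key files carry a "type" field with this value.
    if info.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    # Client libraries read this variable when no credentials are passed in.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(path)
    return info.get("client_email", "")


# Typical use, before texttospeech.TextToSpeechClient() is created:
# use_service_account_key("VoiceAutomated-239f1c05600c.json")
```

Note that the `credentials=` argument of the client expects a credentials object rather than a file name, so passing the path as a string there is unlikely to help.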
I could solve this error by downgrading the library: pip3 install "google-cloud-texttospeech<2.0.0"
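Before downgrading, it can help to confirm which release is actually installed. The sketch below assumes, based on the version pin above, that releases below 2.0.0 use the `texttospeech.types` / `texttospeech.enums` layout and later ones do not; both helper names are hypothetical.

```python
from importlib import metadata  # available from Python 3.8


def uses_legacy_namespaces(version):
    """Return True for releases below 2.0.0 (assumed types/enums layout)."""
    major = int(version.split(".")[0])
    return major < 2


def installed_version(package="google-cloud-texttospeech"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("google-cloud-texttospeech is not installed")
    else:
        style = "types/enums" if uses_legacy_namespaces(version) else "flat"
        print(f"{version}: examples written in the {style} style should apply")
```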
I got the same error when running that script. I checked the source code and the interface has changed: basically, you need to delete every "enums" and "types". It will look similar to this:
from google.cloud import texttospeech
# Instantiates a client
client = texttospeech.TextToSpeechClient()
# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")
# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code='en-US',
    ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL)
# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3)
# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(input=synthesis_input, voice=voice, audio_config=audio_config)
# The response's audio_content is binary.
with open('output.mp3', 'wb') as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')
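If the same script has to run against both layouts, a small lookup helper is one possible workaround. This is a sketch only: `resolve` is a hypothetical helper, and the objects below are stand-ins that mimic the two module layouts, not the real library.

```python
from types import SimpleNamespace


def resolve(module, group, name):
    """Look up `name` under module.<group> when that exists, else on module."""
    namespace = getattr(module, group, module)
    return getattr(namespace, name)


# Stand-ins for the two layouts described in this thread (not the real library).
legacy = SimpleNamespace(types=SimpleNamespace(SynthesisInput="legacy SynthesisInput"))
flat = SimpleNamespace(SynthesisInput="flat SynthesisInput")

print(resolve(legacy, "types", "SynthesisInput"))
print(resolve(flat, "types", "SynthesisInput"))
```

With the real module this would be called as `resolve(texttospeech, "types", "SynthesisInput")(text="Hello, World!")`. The keyword used for the request also differs between the variants posted in this thread (`input=` versus `input_=`), so the call to `synthesize_speech` would still need its own handling.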
I debugged the code and, to get it to work, I had to write enums and types where needed. Taking the Text-to-Speech example from the Google documentation and including some small adjustments:
"""Synthesizes speech from the input string of text or ssml.
Note: ssml must be well-formed according to:
https://www.w3.org/TR/speech-synthesis/
"""
from google.cloud import texttospeech
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "./config/credentials.json"
# Instantiates a client
client = texttospeech.TextToSpeechClient()
# Set the text input to be synthesized
synthesis_input = texttospeech.types.SynthesisInput(text="Hello, World!")
# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.types.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.enums.SsmlVoiceGender.NEUTRAL
)
# Select the type of audio file you want returned
audio_config = texttospeech.types.AudioConfig(
    audio_encoding=texttospeech.enums.AudioEncoding.MP3
)
# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input_=synthesis_input, voice=voice, audio_config=audio_config
)
# The response's audio_content is binary.
with open("./output_tts/output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
print('Audio content written to file "output.mp3"')
Hope this works for you.
It will work with Python 3.6, but it won't work with Python 3.7 with the latest update of google-cloud-texttospeech. If you want to use it with Python 3.7, try the code below.
from google.cloud import texttospeech


def foo(text, credentials):
    # text: the string to synthesize; credentials: your Google credentials
    client = texttospeech.TextToSpeechClient(credentials=credentials)
    translated_text = text
    synthesis_input = texttospeech.types.SynthesisInput(text=translated_text)
    pitch = 1
    speaking_rate = 1
    lang_code = 'en-us'  # your language code here
    gender = 'male'
    # Map a plain string to the matching enum member
    gender_data = {
        'NEUTRAL': texttospeech.enums.SsmlVoiceGender.NEUTRAL,
        'FEMALE': texttospeech.enums.SsmlVoiceGender.FEMALE,
        'MALE': texttospeech.enums.SsmlVoiceGender.MALE
    }
    voice = texttospeech.types.VoiceSelectionParams(language_code=lang_code, ssml_gender=gender_data[gender.upper()])
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3, speaking_rate=float(speaking_rate), pitch=float(pitch)
    )
    print('Voice config and Audio config : ', voice, audio_config)
    response = client.synthesize_speech(
        synthesis_input, voice, audio_config)
    return response
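The string-to-enum dictionary in that function can also be written as a lookup by member name, which avoids keeping the mapping in sync by hand. The sketch below uses a stand-in `IntEnum` (hypothetical, not the library's class) and assumes the library's gender enum supports the same `Enum["NAME"]` lookup; `parse_gender` is a hypothetical helper.

```python
import enum


class StandInGender(enum.IntEnum):
    """Stand-in for the library's voice gender enum (illustrative values)."""
    MALE = 1
    FEMALE = 2
    NEUTRAL = 3


def parse_gender(enum_cls, name, default="NEUTRAL"):
    """Return the member whose name matches `name`, ignoring case and spaces."""
    try:
        return enum_cls[name.strip().upper()]
    except KeyError:
        # Unknown names fall back to the default instead of raising.
        return enum_cls[default]


print(parse_gender(StandInGender, "male").name)
print(parse_gender(StandInGender, "robot").name)
```

Falling back to a default is a design choice for this sketch; raising an error on unknown input may be preferable when the value comes from configuration.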