
I can't seem to get google.cloud.texttospeech to work

I'm using Python 3.8, and I copy-pasted this code as a test.

from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
)

# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')

This is the code shown in Google's documentation: GOOGLE LINK

Now my problem is that I get this error:

PS C:\Users\User\Desktop> & C:/Users/User/AppData/Local/Programs/Python/Python38/python.exe "c:/Users/User/Desktop/from google.cloud import texttospeech.py"
Traceback (most recent call last):
  File "c:/Users/User/Desktop/from google.cloud import texttospeech.py", line 7, in <module>
    synthesis_input = texttospeech.types.SynthesisInput(text="Hello, World!")
AttributeError: module 'google.cloud.texttospeech' has no attribute 'types'
PS C:\Users\User\Desktop>

I tried changing this to add the credentials inside the code, but the problem persists. This is the line I changed:

client = texttospeech.TextToSpeechClient(credentials="VoiceAutomated-239f1c05600c.json")
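
A note on the line above: the `credentials` argument expects a credentials object rather than a path string, and the `AttributeError` in the traceback is raised before credentials come into play, so this change would not be expected to affect it. One common alternative is to export the key file path through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. The sketch below assumes that approach; `point_adc_at_key_file` is a hypothetical helper name, not part of any Google library.

```python
import json
import os


def point_adc_at_key_file(path):
    """Validate a service-account key file and export its path.

    Hypothetical helper: it only sets GOOGLE_APPLICATION_CREDENTIALS,
    which Google client libraries read when no credentials are passed.
    """
    with open(path, "r", encoding="utf-8") as handle:
        info = json.load(handle)  # raises ValueError on malformed JSON
    # Service-account key files carry this marker in their "type" field.
    if info.get("type") != "service_account":
        raise ValueError("not a service-account key file: %s" % path)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(path)
    return os.environ["GOOGLE_APPLICATION_CREDENTIALS"]


# Usage (call before creating the client):
# point_adc_at_key_file("VoiceAutomated-239f1c05600c.json")
# client = texttospeech.TextToSpeechClient()
```

With the variable set, the client can be created without a `credentials` argument.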

user655941 asked Jun 22 '20 20:06



4 Answers

I solved this error by downgrading the library:
pip3 install "google-cloud-texttospeech<2.0.0"
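
Before downgrading, it may help to confirm which release is installed, since the `types` and `enums` submodules exist only in releases below 2.0.0. This is a minimal sketch using the standard library's `importlib.metadata` (available from Python 3.8); the function names `api_style` and `installed_api_style` and the labels "legacy" and "flat" are made up for illustration.

```python
from importlib import metadata  # standard library since Python 3.8


def api_style(version):
    """Return "legacy" for 1.x releases and "flat" for 2.0.0 or newer."""
    major = int(version.split(".")[0])
    return "legacy" if major < 2 else "flat"


def installed_api_style(package="google-cloud-texttospeech"):
    """Look up the installed release, or return None if it is missing."""
    try:
        return api_style(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


# Prints "legacy", "flat", or None depending on the environment.
print(installed_api_style())
```

A result of "flat" means code written with `texttospeech.types` or `texttospeech.enums` will fail with the `AttributeError` from the question unless the library is downgraded or the code is updated.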

Daniel answered Oct 17 '22 19:10


I got the same error when running that script. I checked the source code and found that the interface has changed: you need to delete every "enums" and "types" from the attribute paths. The result looks similar to this:

from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code='en-US',
    ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL)

# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(input=synthesis_input, voice=voice, audio_config=audio_config)

# The response's audio_content is binary.
with open('output.mp3', 'wb') as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')
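
If a script has to run against both layouts, one option is to look each name up wherever it happens to live. The sketch below is only an illustration of that idea: `resolve` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the real module so the example runs without the library installed.

```python
from types import SimpleNamespace


def resolve(module, name):
    """Find `name` on the module itself, then in `types`, then in `enums`."""
    holders = (
        module,
        getattr(module, "types", None),
        getattr(module, "enums", None),
    )
    for holder in holders:
        if holder is not None and hasattr(holder, name):
            return getattr(holder, name)
    raise AttributeError("%r not found in %r" % (name, module))


# Stand-ins for the two layouts; a real script would pass
# `google.cloud.texttospeech` instead.
legacy = SimpleNamespace(types=SimpleNamespace(SynthesisInput="legacy class"))
flat = SimpleNamespace(SynthesisInput="flat class")

print(resolve(legacy, "SynthesisInput"))
print(resolve(flat, "SynthesisInput"))
```

This only covers name lookup. The keyword for the text argument of `synthesize_speech` also differs between the answers on this page (`input` here, `input_` in the older style), so the call itself would still need a per-version branch.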

gamerkore answered Oct 17 '22 20:10


I debugged the code, and to get it to work I had to write enums and types where needed. I took the example from the Google Text-to-Speech documentation and made a few small adjustments:

"""Synthesizes speech from the input string of text or ssml.

Note: ssml must be well-formed according to:
    https://www.w3.org/TR/speech-synthesis/
"""
from google.cloud import texttospeech
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "./config/credentials.json"

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.types.SynthesisInput(text="Hello, World!")

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.types.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.enums.SsmlVoiceGender.NEUTRAL
)

# Select the type of audio file you want returned
audio_config = texttospeech.types.AudioConfig(
    audio_encoding=texttospeech.enums.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input_=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("./output_tts/output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')

Hope this works for you.

jccastiblancor answered Oct 17 '22 20:10


This works with Python 3.6, but not with Python 3.7 and the latest release of google-cloud-texttospeech. If you want to use it with Python 3.7, try the code below.

from google.cloud import texttospeech


def foo(text, credentials=None):
    # Uses the `types` / `enums` interface of library releases below 2.0.0.
    # `credentials` must be a credentials object, not a file path; None
    # falls back to the environment's default credentials.
    client = texttospeech.TextToSpeechClient(credentials=credentials)

    synthesis_input = texttospeech.types.SynthesisInput(text=text)
    pitch = 1
    speaking_rate = 1
    lang_code = 'en-us'  # your language code here
    gender = 'male'

    # Map a plain string onto the matching enum member.
    gender_data = {
        'NEUTRAL': texttospeech.enums.SsmlVoiceGender.NEUTRAL,
        'FEMALE': texttospeech.enums.SsmlVoiceGender.FEMALE,
        'MALE': texttospeech.enums.SsmlVoiceGender.MALE,
    }

    voice = texttospeech.types.VoiceSelectionParams(
        language_code=lang_code, ssml_gender=gender_data[gender.upper()]
    )
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3,
        speaking_rate=float(speaking_rate),
        pitch=float(pitch),
    )

    print('Voice config and Audio config : ', voice, audio_config)
    response = client.synthesize_speech(synthesis_input, voice, audio_config)
    # response.audio_content holds the MP3 bytes.
    return response

Mithlesh kumar answered Oct 17 '22 19:10