What is Google Cloud API for TTS (Text to Speech)? [closed]

In my webapp, I'm trying to make an HTTP request to a Google API that takes some text (such as "Hello World") and returns an MP3 file with the speech equivalent.

I have seen this question: Google text to speech tts api doesn't seems to work, and this Google page: https://cloud.google.com/translate/docs/.

There are lots of other pages that seem out of date -- it looks like this feature has either been removed by Google or moved to a different REST call.

I don't see any documentation (such as in the Google Translate API docs at https://cloud.google.com/translate/) on how to call the Google API for TTS. I have a Google Cloud API account and key.

Thanks, Dan

Dan asked Aug 22 '16 19:08


2 Answers

Google Text-to-Speech is a screen reader application developed for the Android platform; it is currently not available as part of the Google Cloud Platform.

Google Translate, on the other hand, is split between a website add-on and a web-based application with a feature called “Listen”. This feature plays the translation output as audio, but it is currently not possible to download that audio in MP3 format.

It is important not to confuse either of these with the Cloud Translation API, which is part of the Cloud Platform and translates text-based input from one supported language to another.

Lastly, if you would like to see this type of API offered as part of the Google Cloud Platform, you can submit a new Feature Request issue on the Google Public Issue Tracker.

Alex answered Oct 17 '22 22:10


Google recently published the Google Cloud Text-to-Speech API.

A .NET client for Google.Cloud.TextToSpeech can be found here: https://github.com/jhabjan/Google.Cloud.TextToSpeech.V1

Here is a short example of how to use the client:

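using System.IO;
using Google.Apis.Auth.OAuth2;        // GoogleCredential
using Google.Cloud.TextToSpeech.V1;   // namespace assumed from the repository name

// Load credentials from the JSON key file.
GoogleCredential credentials =
    GoogleCredential.FromFile(Path.Combine(Program.AppPath, "jhabjan-test-47a56894d458.json"));

TextToSpeechClient client = TextToSpeechClient.Create(credentials);

// Pass the text to speak, the voice to use, and the output encoding.
SynthesizeSpeechResponse response = client.SynthesizeSpeech(
    new SynthesisInput()
    {
        Text = "Google Cloud Text-to-Speech enables developers to synthesize natural-sounding speech with 32 voices"
    },
    new VoiceSelectionParams()
    {
        LanguageCode = "en-US",
        Name = "en-US-Wavenet-C"
    },
    new AudioConfig()
    {
        AudioEncoding = AudioEncoding.Mp3
    }
);

// Write the returned audio bytes to sample.mp3 in the current directory.
string speechFile = Path.Combine(Directory.GetCurrentDirectory(), "sample.mp3");

File.WriteAllBytes(speechFile, response.AudioContent);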
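
For the plain HTTP call the question asks about, here is a minimal sketch. It assumes the v1 REST method `text:synthesize` on `texttospeech.googleapis.com` and an API key sent as the `key` query parameter; neither comes from the client above, so check both against the current documentation. `TtsRestSketch` and its method names are hypothetical. The request body carries the same three parts as the `SynthesizeSpeech` call (input, voice, audio config), and the response is expected to return the audio as a base64 string in an `audioContent` field.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class TtsRestSketch
{
    // Assumed endpoint of the v1 REST method (not taken from the answer above).
    private const string Endpoint =
        "https://texttospeech.googleapis.com/v1/text:synthesize";

    // Builds the JSON request body for a plain-text synthesis request.
    public static string BuildRequestJson(string text, string languageCode, string voiceName)
    {
        var body = new
        {
            input = new { text },
            voice = new { languageCode, name = voiceName },
            audioConfig = new { audioEncoding = "MP3" }
        };
        return JsonSerializer.Serialize(body);
    }

    // Reads the base64 "audioContent" field of the response and decodes it to bytes.
    public static byte[] DecodeAudio(string responseJson)
    {
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        string base64 = doc.RootElement.GetProperty("audioContent").GetString();
        return Convert.FromBase64String(base64);
    }

    // Sends the request and returns the MP3 bytes; throws on a non-success status.
    public static async Task<byte[]> SynthesizeAsync(HttpClient http, string apiKey, string text)
    {
        string url = Endpoint + "?key=" + Uri.EscapeDataString(apiKey);
        using var content = new StringContent(
            BuildRequestJson(text, "en-US", "en-US-Wavenet-C"),
            Encoding.UTF8,
            "application/json");
        using HttpResponseMessage response = await http.PostAsync(url, content);
        response.EnsureSuccessStatusCode();
        return DecodeAudio(await response.Content.ReadAsStringAsync());
    }
}
```

The bytes returned by `SynthesizeAsync` can be written with `File.WriteAllBytes` exactly as in the client example. Keeping `BuildRequestJson` and `DecodeAudio` separate from the network call makes it possible to check the payload shape without an API key.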

HABJAN answered Oct 17 '22 21:10