
Google-speech-api transcribing spoken numbers incorrectly

I started using the Google Speech API to transcribe audio.

The audio being transcribed contains many numbers spoken one after the other.

E.g. 273 298

But the transcription comes back as 270-3298.

My guess is that it is interpreting it as some sort of phone number.

What I want is unparsed output, e.g. "two seventy three two ninety eight", which I can deal with and parse on my own.

Is there a setting or support for this kind of thing?

thanks

Moshe Rayman asked Oct 06 '16 10:10


3 Answers

I had this exact same problem and I think we found a solution. If your input is English, switch the language code to en-PH just when working with numbers. Google will then not format the result as a U.S. phone number or try to stick an extra digit in there.
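
For illustration, here is a rough sketch of what a request body with that language code might look like. It only builds the JSON and does not call the service. The helper name `build_recognize_request` is made up for this example, the field names assume the v1 REST request format (other API versions may name them differently), and the encoding and sample rate are placeholders you would replace with your audio's real values.

```python
import base64
import json


def build_recognize_request(audio_bytes, language_code="en-PH"):
    """Build a JSON-serializable request body (hypothetical helper).

    Field names assume the v1 REST format; check them against the
    API version you are actually calling.
    """
    return {
        "config": {
            # Placeholder audio settings, not taken from the question.
            "encoding": "LINEAR16",
            "sampleRateHertz": 16000,
            # The suggestion from this answer: en-PH instead of en-US.
            "languageCode": language_code,
        },
        # Inline audio is sent as base64-encoded text.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


body = build_recognize_request(b"\x00\x01")
print(json.dumps(body["config"], indent=2))
```

Keeping the language code as a parameter makes it easy to use en-PH only for the number-heavy audio and your usual code everywhere else.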

Sam answered Oct 17 '22 01:10


Try passing a speech context with some phrase hints. Usage is documented here: https://cloud.google.com/speech/docs/basics#phrase-hints

Give it the spelled-out numbers that you want recognized.

"speech_context": {
  "phrases": ["zero", "one", "two", ... "nine", "ten", "eleven", ... "twenty", "thirty", ... "ninety"]
}

This isn't guaranteed to work, but it may help.
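
If you would rather generate the phrase list than type it out, something like the following might do. It is only a sketch: the full word lists are my guess at what the elided entries above stand for, `number_phrases` is a hypothetical helper, and the `speech_context` key is copied from the snippet above (the exact field name may differ between API versions).

```python
import json

# Assumed expansion of the elided lists in the snippet above.
UNITS = ["zero", "one", "two", "three", "four",
         "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_phrases():
    """Return spelled-out number words to use as phrase hints."""
    return UNITS + TEENS + TENS


# Key name taken from the snippet above; verify it for your API version.
speech_context = {"phrases": number_phrases()}
print(json.dumps({"speech_context": speech_context}))
```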

blambert answered Oct 17 '22 02:10


For the record, I tried blambert's solution above and it doesn't work, unfortunately. I posted another question recently seeing if anyone has found a way to defeat this behavior, as it is preventing me from implementing a transcription service that I had planned.

justishar answered Oct 17 '22 00:10