How to identify AudioEncoding and SampleRateHertz of an audio file

Question

I am working with the Google Cloud Speech-to-Text samples. I took a sample from this link, GoogleCloudPlatform speech to text sample, and I also referred to Quickstart: Using Client Libraries. The sample files provided with that example work fine: the API returns the text of the audio file. But if I supply my own audio file, it does not return anything.

The request to the Cloud API includes the audio file, the AudioEncoding and the SampleRateHertz. The issue may be that the AudioEncoding and SampleRateHertz I send do not match my own audio file.

How can I identify the AudioEncoding and SampleRateHertz of an audio file?

maio290 · Accepted Answer

AudioEncoding's Java enum has the following possible values:

AudioEncoding.AMR -> .amr/.3gp files

AudioEncoding.AMR_WB -> .awb/.3gp files

AudioEncoding.FLAC -> .flac files

AudioEncoding.LINEAR16 -> .wav files

AudioEncoding.MULAW -> .wav files

AudioEncoding.OGG_OPUS -> .ogg/.opus files

AudioEncoding.SPEEX_WITH_HEADER_BYTE -> no clue, maybe .speex
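The mapping above can be sketched as a simple lookup by file extension. This is only a rough first guess, not a reliable detector: the class name EncodingGuess and the method guess are made up for illustration, the result is returned as the name of the enum constant (a plain String) so the snippet runs without the Speech client library, and it assumes .amr for narrowband AMR. Ambiguous extensions such as .3gp are deliberately left out.

```java
import java.util.Locale;
import java.util.Map;

public class EncodingGuess {
    // Extension -> name of the AudioEncoding constant, following the list above.
    // Illustrative only: a container extension does not guarantee the codec inside.
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "amr", "AMR",
        "awb", "AMR_WB",
        "flac", "FLAC",
        "wav", "LINEAR16",   // could also be MULAW; the file header decides
        "ogg", "OGG_OPUS",   // only if the Ogg stream really contains Opus
        "opus", "OGG_OPUS");

    /** Returns a guessed constant name, or ENCODING_UNSPECIFIED if the extension is unknown. */
    static String guess(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "ENCODING_UNSPECIFIED";
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, "ENCODING_UNSPECIFIED");
    }

    public static void main(String[] args) {
        System.out.println(guess(args.length > 0 ? args[0] : "commercial_stereo.wav"));
    }
}
```

If the Speech client library is on the classpath, the returned name could be turned into the enum with AudioEncoding.valueOf(name).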

So you could make a first guess from the file extension. For the SampleRateHertz, you could use a tool like Apache Tika. For commercial_stereo.wav it outputs the following:

Content-Length: 6305632
Content-Type: audio/vnd.wave
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-Parsed-By: org.apache.tika.parser.audio.AudioParser
X-TIKA:digest:MD5: 7e3e8837273e8bb143533894926f7da3
X-TIKA:digest:SHA256: 98fac004fb662ad8f720e680c81e3b4c9dea20190f5d1d908cece2cd6b30f01e
bits: 16
channels: 2
encoding: PCM_SIGNED
resourceName: commercial_stereo.wav
samplerate: 44100.0
xmpDM:audioSampleRate: 44100
xmpDM:audioSampleType: 16Int
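If you would rather not pull in Tika, the same header fields can usually be read with the JDK's own javax.sound.sampled package. The following is a minimal sketch under a few assumptions: it only handles the formats the JDK can parse out of the box (WAV, AU, AIFF, so not FLAC, AMR or Opus), the class name AudioProbe and its methods are hypothetical, and the mapping to an AudioEncoding name is my own reading of the enum (16-bit signed little-endian PCM as LINEAR16, u-law as MULAW).

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

public class AudioProbe {
    /** Reads the file header and summarises encoding, sample rate, bit depth and channels. */
    static String describe(File file) throws IOException, UnsupportedAudioFileException {
        AudioFormat f = AudioSystem.getAudioFileFormat(file).getFormat();
        return String.format("%s %d Hz, %d bit, %d ch",
            f.getEncoding(), Math.round(f.getSampleRate()),
            f.getSampleSizeInBits(), f.getChannels());
    }

    /** Maps the header to the name of an AudioEncoding constant (assumed mapping). */
    static String speechEncoding(File file) throws IOException, UnsupportedAudioFileException {
        AudioFormat f = AudioSystem.getAudioFileFormat(file).getFormat();
        if (AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleSizeInBits() == 16 && !f.isBigEndian()) {
            return "LINEAR16";
        }
        if (AudioFormat.Encoding.ULAW.equals(f.getEncoding())) {
            return "MULAW";
        }
        return "ENCODING_UNSPECIFIED";
    }

    public static void main(String[] args) throws Exception {
        File file = new File(args[0]);  // path to your own audio file
        System.out.println(describe(file));
        System.out.println(speechEncoding(file));
    }
}
```

The rounded sample rate is the value to pass as SampleRateHertz. The channel count is worth checking as well, since it is reported in the same header.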

Tags:

java

google-cloud-platform

speech-to-text

Prasath

1 Answers

maio290
