How to identify AudioEncoding and SampleRateHertz of an audio file

I am working on the Google Cloud Speech-to-Text samples. I took a sample from this link: GoogleCloudPlatform speech to text sample, and I referred to Quickstart: Using Client Libraries. The sample files given in that example work fine: the API returns the text of the audio file. But if I give it my own audio file, it does not return anything.

The Cloud request includes the audio file, AudioEncoding and SampleRateHertz. The issue may be in the AudioEncoding and SampleRateHertz values I send for my own audio file.

How to identify AudioEncoding and SampleRateHertz of an audio file?

Prasath asked Oct 19 '25 09:10


1 Answer

AudioEncoding's Java enum has the following possible values:

AudioEncoding.AMR -> .amr/.3gp files

AudioEncoding.AMR_WB -> .awb/.3gp files

AudioEncoding.FLAC -> .flac files

AudioEncoding.LINEAR16 -> .wav files

AudioEncoding.MULAW -> .wav files

AudioEncoding.OGG_OPUS -> .ogg/.opus files

AudioEncoding.SPEEX_WITH_HEADER_BYTE -> no clue, maybe .speex

So you can make a first guess from the file extension. For the SampleRateHertz you could use a tool like Apache Tika. For commercial_stereo.wav it outputs the following:

Content-Length: 6305632
Content-Type: audio/vnd.wave
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-Parsed-By: org.apache.tika.parser.audio.AudioParser
X-TIKA:digest:MD5: 7e3e8837273e8bb143533894926f7da3
X-TIKA:digest:SHA256: 98fac004fb662ad8f720e680c81e3b4c9dea20190f5d1d908cece2cd6b30f01e
bits: 16
channels: 2
encoding: PCM_SIGNED
resourceName: commercial_stereo.wav
samplerate: 44100.0
xmpDM:audioSampleRate: 44100
xmpDM:audioSampleType: 16Int
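
If pulling in Tika is too heavy, the JDK's own javax.sound.sampled package can read the same header fields. Here is a minimal sketch, assuming the file is a WAV, AIFF or AU file (Java Sound does not decode FLAC, Ogg Opus or AMR out of the box). The class name AudioFormatProbe and the helper guessCloudEncoding are made up for illustration, and the helper returns the enum constant's name as a plain string so that the snippet compiles without the Speech client library.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.File;
import java.io.IOException;

public class AudioFormatProbe {

    /** Reads only the file header and returns the audio format it declares. */
    static AudioFormat probe(File file) throws IOException, UnsupportedAudioFileException {
        return AudioSystem.getAudioFileFormat(file).getFormat();
    }

    /**
     * Hypothetical helper: maps a Java Sound encoding to the NAME of the
     * AudioEncoding constant that most likely fits. Only the two WAV cases
     * from the list above are covered; everything else is "UNKNOWN".
     */
    static String guessCloudEncoding(AudioFormat format) {
        AudioFormat.Encoding enc = format.getEncoding();
        if (AudioFormat.Encoding.PCM_SIGNED.equals(enc) && format.getSampleSizeInBits() == 16) {
            return "LINEAR16";
        }
        if (AudioFormat.Encoding.ULAW.equals(enc)) {
            return "MULAW";
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java AudioFormatProbe <audio-file>");
            return;
        }
        AudioFormat format = probe(new File(args[0]));
        System.out.println("encoding:   " + format.getEncoding());
        System.out.println("bits:       " + format.getSampleSizeInBits());
        System.out.println("sampleRate: " + (int) format.getSampleRate());
        System.out.println("channels:   " + format.getChannels());
        System.out.println("guess:      " + guessCloudEncoding(format));
    }
}
```

Run against commercial_stereo.wav, this should report the same 16-bit PCM_SIGNED, 44100 Hz, 2-channel format that Tika shows, which would point to AudioEncoding.LINEAR16 with a SampleRateHertz of 44100. Keep an eye on the channel count as well: as far as I know, a stereo file needs the channel count set in the request config too, or a downmix to mono first.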
maio290 answered Oct 21 '25 23:10