How to use CMU Sphinx 4 for speech to text with English Voxforge models

I'm trying to figure out how to use sphinx4 or pocketsphinx with the English Voxforge model, but I can't get it working. I have read the documentation pages (such as http://cmusphinx.sourceforge.net/sphinx4/doc/UsingSphinxTrainModels.html ), but they have not helped.

What I want is an executable where I can specify which model to use and which audio file to use as the source, and have the executable print its best guess at what the voice in the recording says.

I had some luck with: pocketsphinx_continuous -infile recording.wav 2> /dev/null

But it aborts before the complete audio file is transcribed, and the default model has far too few words to produce readable text from the audio.

I have compiled and tested the demos in the sphinx4 source package, but all the examples seem to have too few words and would need a model like the Voxforge one to be useful to me.

How can I set this up?

tirithen asked Dec 31 '11 00:12




1 Answer

It's very simple to plug in the Voxforge acoustic model. The main document covering the API is the CMUSphinx tutorial:

http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4

Read it before you start. Also note that the En_US generic English acoustic model is recommended instead, because it is more accurate than Voxforge.

Step by step you need to do the following:

  • Download the Voxforge model from SourceForge and unpack it into a folder
  • Check out sphinx4 from GitHub and build it with Gradle
  • Run TranscriberDemo
  • Go to the sphinx4-samples/src/main/java/edu/cmu/sphinx/demo/transcriber folder, open the TranscriberDemo source, and edit the acoustic model path as shown below
  • Edit the location of the audio file in the source if you want to transcribe a different audio file
  • Run the demo again and enjoy
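Before pointing the demo at your own recording, it may be worth checking the audio format. The CMUSphinx tutorial states that the decoder expects 16-bit, mono, little-endian PCM WAV at the sample rate the model was trained on; the sketch below assumes a 16 kHz model. It uses only the standard javax.sound.sampled API. The class name WavFormatCheck and the method isSphinxFriendly are made up for this illustration.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper, not part of sphinx4.
public class WavFormatCheck {

    // True if the format is signed 16-bit, mono, little-endian PCM at 16 kHz
    // (the 16 kHz value is an assumption about the model being used).
    public static boolean isSphinxFriendly(AudioFormat f) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleRate() == 16000f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1
                && !f.isBigEndian();
    }

    public static void main(String[] args)
            throws IOException, UnsupportedAudioFileException {
        if (args.length != 1) {
            System.err.println("usage: WavFormatCheck <audio.wav>");
            System.exit(1);
        }
        // Read only the header of the file to learn its format.
        AudioFormat format =
                AudioSystem.getAudioFileFormat(new File(args[0])).getFormat();
        System.out.println("Detected format: " + format);
        System.out.println(isSphinxFriendly(format)
                ? "Format matches what the recognizer expects."
                : "Convert the file to 16 kHz, 16-bit, mono PCM first.");
    }
}
```

If the check fails, converting the recording with an audio tool of your choice before running the demo is probably the simplest fix.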

That is all there is to it. The edited acoustic model line looks like this:

   // Load model from the folder in your project
   configuration.setAcousticModelPath("file:voxforge-en-0.4/model_parameters/voxforge_en_sphinx.cd_cont_5000");
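If you would rather have a small command-line program than an edited demo, the same idea can be sketched as follows. This is a minimal sketch, not the TranscriberDemo source: the class name VoxforgeTranscriber and the helper asFileUrl are made up here, and it assumes the sphinx4-core jar is on the classpath. The answer only gives the acoustic model path, so the dictionary and language model that match your model are taken as arguments rather than guessed.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.SpeechResult;
import edu.cmu.sphinx.api.StreamSpeechRecognizer;

// Hypothetical class name; requires sphinx4-core on the classpath.
public class VoxforgeTranscriber {

    // Prefix plain filesystem paths with "file:", the form used in the answer.
    public static String asFileUrl(String path) {
        if (path.startsWith("file:") || path.startsWith("resource:")) {
            return path;
        }
        return "file:" + path;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: VoxforgeTranscriber "
                    + "<acoustic-model-dir> <dictionary> <language-model> <audio.wav>");
            System.exit(1);
        }

        Configuration configuration = new Configuration();
        configuration.setAcousticModelPath(asFileUrl(args[0]));
        configuration.setDictionaryPath(asFileUrl(args[1]));
        configuration.setLanguageModelPath(asFileUrl(args[2]));

        StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
        try (InputStream audio = new FileInputStream(args[3])) {
            recognizer.startRecognition(audio);
            SpeechResult result;
            // getResult() returns null once the end of the stream is reached.
            while ((result = recognizer.getResult()) != null) {
                System.out.println(result.getHypothesis());
            }
            recognizer.stopRecognition();
        }
    }
}
```

Passing the paths as arguments keeps the program independent of any particular model layout, so the same build should work for Voxforge or the generic English model.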
Nikolay Shmyrev answered Sep 17 '22 01:09