 

How to use CMU Sphinx speech recognition with Ruby application?

I'm looking for a way to use CMU Sphinx from a Ruby (Rails) application. The task is very simple: I have an mp3 file and I want to get it transcribed into text.

What is the easiest way to implement this? I don't know C/C++, and my task isn't big enough to justify learning C/C++ for it :)

Thanks for the help!

Alve asked Nov 08 '12 17:11




1 Answer

CMUSphinx provides several interfaces for speech recognition. Some may suit you better than others:

  1. Use the command-line tools and run them as external binaries from your Rails application. The tool to run is pocketsphinx_continuous. For more information on invoking binaries from Rails, see the question: how to execute binary on heroku?

  2. Invoke the Sphinx4 framework from JRuby on the JVM. For an example of using Sphinx4 from JRuby, see: http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4#writing_scripts

  3. Build pocketsphinx bindings for Ruby using SWIG. The easy part is that SWIG wrappers for Python already exist as part of pocketsphinx, so you only need to use SWIG to generate the Ruby wrappers: https://sourceforge.net/p/cmusphinx/code/11643/tree/trunk/pocketsphinx/swig/

  4. Finally, implement a REST web service with a Java REST framework that converts audio to text using the CMUSphinx tools, and invoke the service from your Ruby code. For more information, see how to use REST from Rails. This way you can make your system really scalable.
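As a rough illustration of option 1, here is a minimal Ruby sketch that shells out to pocketsphinx_continuous. It rests on several assumptions that are not part of the answer above: that ffmpeg is installed and is used to convert the mp3 first (the default pocketsphinx models expect 16 kHz, 16-bit, mono WAV input), that both binaries are on the PATH, and that the default acoustic and language models are acceptable. The helper names (ffmpeg_command, pocketsphinx_command, run!, transcribe) are hypothetical and exist only for this example.

```ruby
require "open3"
require "tmpdir"

# Build the ffmpeg command that converts the input to 16 kHz mono WAV.
# (Assumption: ffmpeg is the converter; any tool producing that format works.)
def ffmpeg_command(input_path, wav_path)
  ["ffmpeg", "-y", "-i", input_path, "-ar", "16000", "-ac", "1", wav_path]
end

# Build the pocketsphinx_continuous command that reads from a file instead of
# the microphone and sends its verbose log to the null device.
def pocketsphinx_command(wav_path)
  ["pocketsphinx_continuous", "-infile", wav_path, "-logfn", File::NULL]
end

# Run a command without going through a shell, so file names with spaces
# or special characters are passed safely. Returns stdout, raises on failure.
def run!(command)
  stdout, stderr, status = Open3.capture3(*command)
  raise "#{command.first} failed: #{stderr}" unless status.success?
  stdout
end

# Convert the mp3 to a temporary WAV file, then return the recognized text.
def transcribe(mp3_path)
  Dir.mktmpdir do |dir|
    wav_path = File.join(dir, "audio.wav")
    run!(ffmpeg_command(mp3_path, wav_path))
    run!(pocketsphinx_command(wav_path)).strip
  end
end
```

Calling transcribe("speech.mp3") should return the recognized text as a string. In a Rails application, recognition can take a long time for large files, so it is probably better to run this from a background job than inside a request.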

Nikolay Shmyrev answered Sep 24 '22 16:09