
Python Speech Compare

I have two .wav files that I need to compare and decide if they contain the same words (same order too).

I have been searching for the best method for a while now. I can't figure out how to have pyspeech use a file as input. I've tried getting the CMU Sphinx project working, but I can't seem to get GStreamer to work with Python 2.7, let alone their project. I've messed around with Dragonfly as well, with no luck.

I am using Windows 7 64-bit with Python 2.7. Does anyone have any ideas?

Any help is greatly appreciated.

Kreuzade asked Feb 22 '12 22:02



1 Answer

You could try PySpeech. For more information, see the question "pyspeech (python) - Transcribe mp3 files?". I have never used it, but I believe it leverages the speech recognition engine built into Windows. This would let you convert the WAV files to text and then compare the text.
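Once both files have been transcribed, the comparison step could look something like the sketch below. It assumes Python 3 and that the transcripts are already available as plain strings; the function names `words`, `same_words`, and `similarity` are made up for illustration and are not part of PySpeech. Because recognizers rarely produce identical output for two recordings, a similarity ratio from the standard library's `difflib` is included alongside the strict check.

```python
import difflib
import re


def words(transcript):
    """Lower-case a transcript and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", transcript.lower())


def same_words(transcript_a, transcript_b):
    """True when both transcripts contain the same words in the same order."""
    return words(transcript_a) == words(transcript_b)


def similarity(transcript_a, transcript_b):
    """Ratio in [0, 1] describing how closely the two word sequences match."""
    matcher = difflib.SequenceMatcher(None, words(transcript_a), words(transcript_b))
    return matcher.ratio()
```

A strict `same_words` check answers the question as asked; a threshold on `similarity` may be more practical if the recognizer mishears a word or two, though any threshold would need tuning on real recordings.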

To use the Windows speech engine with a WAV file as input, there are two requirements:

  1. Use an inproc recognizer (SpeechRecognitionEngine). Shared recognizers cannot use WAV files as input.
  2. On the recognizer object, call SetInputToWaveFile to specify your input WAV file.
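The two requirements above could be sketched from Python roughly as follows. This is an untested outline under several assumptions that are not in the original answer: Python 3 on Windows, the third-party pythonnet package (which provides the `clr` module) to reach the .NET `System.Speech` assembly, and a free-form `DictationGrammar`. The function name `transcribe_wav` is hypothetical.

```python
def transcribe_wav(path):
    """Return the text recognized in a WAV file (Windows only, assumes pythonnet)."""
    # Imported lazily so this module still loads where pythonnet is not installed.
    import clr  # provided by the pythonnet package (an assumption of this sketch)

    clr.AddReference("System.Speech")
    from System.Speech.Recognition import DictationGrammar, SpeechRecognitionEngine

    engine = SpeechRecognitionEngine()  # in-process recognizer, not the shared one
    try:
        engine.LoadGrammar(DictationGrammar())  # free-form dictation grammar
        engine.SetInputToWaveFile(path)         # read audio from the file
        phrases = []
        while True:
            result = engine.Recognize()  # None when no further phrase is recognized
            if result is None:
                break
            phrases.append(result.Text)
        return " ".join(phrases)
    finally:
        engine.Dispose()
```

Calling `transcribe_wav` on each of the two files would give two strings that can then be compared as text.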

You may have to resample the WAV files, because the speech recognition engines only support certain sample rates. The following format works well on Windows:

  • 8 bits per sample
  • single channel (mono)
  • 22,050 samples per second
  • PCM encoding

See https://stackoverflow.com/a/6203533/90236 for some more info.
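Before resampling anything, it may help to check whether a file already has the format listed above. The sketch below assumes Python 3 and uses only the standard library's `wave` module, which reads PCM WAV files and raises `wave.Error` for formats it does not understand. The names `TARGET`, `wav_format`, and `needs_resampling` are made up for illustration.

```python
import wave

# The format the answer reports as working well on Windows:
# mono, 8 bits (1 byte) per sample, 22,050 samples per second.
TARGET = {"channels": 1, "sample_width_bytes": 1, "frame_rate": 22050}


def wav_format(path):
    """Read the channel count, sample width, and sample rate of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate": wav.getframerate(),
        }


def needs_resampling(path):
    """True when the file's format differs from TARGET."""
    return wav_format(path) != TARGET
```

The conversion itself is left out here; an external audio tool is one way to do it.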

For more background on the Windows speech engines, you might take a look at the questions "SAPI and Windows 7 Problem" and "What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?".

Michael Levy answered Oct 22 '22 13:10