Python Speech recognition produces bad results

I'm trying to get my speech recognition script working but it can't understand me.

import pyaudio
import speech_recognition as sr

def initSpeech():
    r = sr.Recognizer()

    with sr.Microphone() as source:
        r.adjust_for_ambient_noise(source, duration=2)
        print("Set minimum energy threshold to {}".format(r.energy_threshold))
        print("Say something")

        audio = r.listen(source, phrase_time_limit=10)

        command = ""
        try:
            command = r.recognize_google(audio)
        except (sr.UnknownValueError, sr.RequestError):
            print("Coundn't understand you!")

        print(command)

initSpeech()

This is my code to recognize my voice, but it always prints "Coundn't understand you!". When I record my voice in Python with the following script and use the resulting WAV file as input for the speech recognition, it works fine:

import pyaudio
import wave

CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"

p = pyaudio.PyAudio()

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("* recording")

frames = []

for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)

print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()

This script records my voice; I then use the resulting file "output.wav" as input for the speech recognition.

EDIT:

With the following snippet,

with open("microphone-results.wav", "wb") as f:
    f.write(audio.get_wav_data())

I saved the audio that gets passed to the recognizer. It sounded really bad: low and slow, like a voice changer in a bad movie. Maybe this is a hint toward the solution. I already checked the chunk_size and sample_rate settings; they are identical to the settings in my recording script above. My system: Windows 10

There is also an issue on GitHub: issue 358

Python: 3.6

Thank you for your help!

asked May 13 '18 14:05

Tobias Schäfer

1 Answer

Your audio is obviously not being recorded properly, and that is what makes recognition fail. My guess is that r.adjust_for_ambient_noise is failing you (automatic speech/silence detectors are not simple to implement). Start by removing that line and setting the threshold manually:

r.energy_threshold = 50
r.dynamic_energy_threshold = False
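To get a feel for what a fixed value such as 50 means: as far as I know, the library compares the RMS amplitude of each audio chunk against energy_threshold. Below is a minimal sketch, assuming 16-bit little-endian PCM, that computes that value from raw bytes so you can compare silence and speech levels yourself. The helper name chunk_rms is made up for illustration and is not part of speech_recognition.

```python
import math
import struct


def chunk_rms(pcm_bytes):
    """Return the RMS amplitude of 16-bit signed little-endian PCM samples.

    Hypothetical helper for illustration; not a speech_recognition function.
    """
    count = len(pcm_bytes) // 2  # two bytes per 16-bit sample
    if count == 0:
        return 0.0
    samples = struct.unpack("<{}h".format(count), pcm_bytes[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


# Synthetic data standing in for a quiet chunk and a loud chunk.
quiet = struct.pack("<4h", 10, -10, 10, -10)
loud = struct.pack("<4h", 3000, -3000, 3000, -3000)

print(chunk_rms(quiet))  # 10.0
print(chunk_rms(loud))   # 3000.0
```

If the chunks recorded while you are silent already sit above the threshold you set, the recognizer would probably treat the noise as speech, so it may be worth measuring a few real chunks before settling on a number.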

After that, save the recorded audio to a .WAV file and listen to it. You have to get clear audio before you send it to the ASR engine.
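Besides listening, it can help to look at the header of the saved file. A minimal sketch using only the standard wave module; the function name describe_wav and the dictionary keys are my own choices, not anything from the library:

```python
import wave


def describe_wav(path):
    """Return basic header fields of a WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": rate,
            # Duration as implied by the header, not by the wall clock.
            "duration_s": frames / float(rate),
        }
```

For example, call describe_wav("microphone-results.wav") on the file from the question. If you spoke for about five seconds but duration_s comes out much longer, the sample rate written to the header is probably lower than what the device actually delivered, which would match audio that sounds low and slow. That is a guess to check, not a confirmed diagnosis.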

Also, I recommend making sure that you are using the microphone you intended to use:

print(sr.Microphone.list_microphone_names()[0])
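If the first entry is not the device you want, you can pick one explicitly with the device_index argument of sr.Microphone. A rough sketch follows; find_device_index and open_microphone are hypothetical helper names, and the name fragment you search for is an assumption about how your device is labeled. The list may also contain entries that are not inputs, so check the printed names.

```python
def find_device_index(names, fragment):
    """Return the index of the first name containing fragment, or None.

    The match is case-insensitive; entries that are None are skipped.
    """
    fragment = fragment.lower()
    for index, name in enumerate(names):
        if name is not None and fragment in name.lower():
            return index
    return None


def open_microphone(fragment):
    """Build an sr.Microphone for the first device whose name matches.

    Needs the SpeechRecognition and PyAudio packages; the import is kept
    local so find_device_index stays usable without them.
    """
    import speech_recognition as sr

    names = sr.Microphone.list_microphone_names()
    for index, name in enumerate(names):
        print(index, name)  # show every device so you can verify the choice
    index = find_device_index(names, fragment)
    if index is None:
        raise ValueError("no device matching {!r}".format(fragment))
    return sr.Microphone(device_index=index)
```

You would then write something like with open_microphone("usb") as source: in place of with sr.Microphone() as source:, where "usb" stands for whatever part of the name identifies your microphone.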

answered Oct 26 '22 23:10

igrinis

Tags:

python

python-3.6

windows-10

speech-recognition