I have been trying to do real-time audio signal processing using 'pyAudio' module in python. What I did was a simple case of reading audio data from microphone and play it via headphones. I tried with the following code(both Python and Cython versions). Thought it works but unfortunately it is stalls and not smooth enough. How can I improve the code so that it will run smoothly. My PC is i7, 8GB RAM.
Python Version
import pyaudio
import numpy as np
RATE = 16000
CHUNK = 256
p = pyaudio.PyAudio()
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True,
frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
for i in range(int(20*RATE/CHUNK)): #do this for 10 seconds
player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16))
stream.stop_stream()
stream.close()
p.terminate()
Cython Version
import pyaudio
import numpy as np
cdef int RATE = 16000
cdef int CHUNK = 1024
cdef int i
p = pyaudio.PyAudio()
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
for i in range(500): #do this for 10 seconds
player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16))
stream.stop_stream()
stream.close()
p.terminate()
Python has some great libraries for audio processing like Librosa and PyAudio. There are also built-in modules for some basic audio functionalities. It is a Python module to analyze audio signals in general but geared more towards music. It includes the nuts and bolts to build a MIR(Music information retrieval) system.
Digital Signal Processing (DSP) with Python Programming [Book] Get full access to Digital Signal Processing (DSP) with Python Programming and 60K+ other titles, with free 10-day trial of O'Reilly.
The signal should be normalized in range 0 to 1 by taking abs(signal)/(maximum of datatype), plot the signal from wave lib you will find the difference in values.
The code below will take the default input device, and output what's recorded into the default output device.
import PyAudio
import numpy as np
p = pyaudio.PyAudio()
CHANNELS = 2
RATE = 44100
def callback(in_data, frame_count, time_info, flag):
# using Numpy to convert to array for processing
# audio_data = np.fromstring(in_data, dtype=np.float32)
return in_data, pyaudio.paContinue
stream = p.open(format=pyaudio.paFloat32,
channels=CHANNELS,
rate=RATE,
output=True,
input=True,
stream_callback=callback)
stream.start_stream()
while stream.is_active():
time.sleep(20)
stream.stop_stream()
print("Stream is stopped")
stream.close()
p.terminate()
This will run for 20 seconds and stop. The method callback is where you can process the signal :
audio_data = np.fromstring(in_data, dtype=np.float32)
return in_data
is where you send back post-processed data to the output device.
Note chunk has a default argument of 1024 as noted in the PyAudio docs: http://people.csail.mit.edu/hubert/pyaudio/docs/#pyaudio.PyAudio.open
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With