After going through the PyAudio documentation and reading some other articles on the web, I am not sure whether my understanding is correct.
This is the code for audio recording found on pyaudio's site:
    import pyaudio
    import wave

    CHUNK = 1024
    FORMAT = pyaudio.paInt16
    CHANNELS = 2
    RATE = 44100
    RECORD_SECONDS = 5
    WAVE_OUTPUT_FILENAME = "output.wav"

    p = pyaudio.PyAudio()

    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)

    print("* recording")

    frames = []

    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)

    print("* done recording")

    stream.stop_stream()
    stream.close()
    p.terminate()
and if I add these lines then I am able to play whatever I recorded:
    play = pyaudio.PyAudio()
    stream_play = play.open(format=FORMAT,
                            channels=CHANNELS,
                            rate=RATE,
                            output=True)

    for data in frames:
        stream_play.write(data)

    stream_play.stop_stream()
    stream_play.close()
    play.terminate()
From what I understood, the size of each sample is 2 bytes, as returned by pyaudio.get_sample_size(pyaudio.paInt16). With CHANNELS = 2 and CHUNK = 1024, each element of the frames list, for example frames[0], must therefore be 4096 bytes. However, sys.getsizeof(frames[0]) returns 4133, but len(frames[0]) returns 4096.

The for loop executes int(RATE / CHUNK * RECORD_SECONDS) times, and I can't understand why. The same question was answered by "Ruben Sanchez", but I can't be sure that his answer is correct, as he says CHUNK=bytes. According to his explanation, it should be int(RATE / (CHUNK*2) * RECORD_SECONDS), as (CHUNK*2) is the number of samples read into the buffer with each iteration.

When I write print frames[0], it prints gibberish, because it tries to treat the string as ASCII-encoded text, which it is not; it is just a stream of bytes. So how do I print this stream of bytes in hexadecimal using the struct module? And if I later replace each of the hexadecimal values with values of my choice, will it still produce a playable sound?

Whatever I wrote above is my understanding of things, and much of it may be wrong.
In blocking mode, each stream.write() or stream.read() call blocks until all the given/requested frames have been played/recorded. Alternatively, to generate audio data on the fly or immediately process recorded audio data, use the “callback mode” outlined below.
In callback mode, PyAudio will call a specified callback function whenever it needs new audio data (to play) and/or when there is new (recorded) audio data available. Note that PyAudio calls the callback function in a separate thread.
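As a rough illustration of callback mode for recording, here is a minimal sketch. The helper names make_callback and record_with_callback are made up for this example, and it assumes an input-only stream with the same format settings as the recording code in the question. It is not taken from the PyAudio documentation and has not been tuned for real use.

```python
import time


def make_callback(store, continue_flag):
    """Build an input callback that appends every recorded buffer to `store`.

    `continue_flag` is returned to keep the stream running; with PyAudio
    this would be pyaudio.paContinue. (Hypothetical helper for this sketch.)
    """
    def callback(in_data, frame_count, time_info, status):
        store.append(in_data)          # in_data holds frame_count recorded frames
        return (None, continue_flag)   # an input-only stream has no output data
    return callback


def record_with_callback(seconds=5, rate=44100, channels=2, chunk=1024):
    """Record for `seconds` in callback mode (needs PyAudio and an input device)."""
    import pyaudio  # imported here so make_callback() works without PyAudio

    frames = []
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16,
                    channels=channels,
                    rate=rate,
                    input=True,
                    frames_per_buffer=chunk,
                    stream_callback=make_callback(frames, pyaudio.paContinue))
    time.sleep(seconds)   # meanwhile PyAudio calls the callback in its own thread
    stream.stop_stream()
    stream.close()
    p.terminate()
    return frames
```

Calling record_with_callback() should return a list of byte strings much like the frames list built by the blocking loop, except that the main thread only sleeps while the buffers are collected in the background.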
PyAudio provides Python bindings for PortAudio v19, the cross-platform audio I/O library. With PyAudio, you can easily use Python to play and record audio on a variety of platforms, such as GNU/Linux, Microsoft Windows, and Apple macOS.
sys.getsizeof() reports the storage space needed by the Python interpreter for the whole object, which is typically a bit more than the actual size of the raw data. len(frames[0]) gives the size of the raw data itself.

RATE * RECORD_SECONDS is the number of frames that should be recorded. Since the for loop is not repeated for each frame but only for each chunk, the number of frames has to be divided by the chunk size CHUNK to get the number of loop iterations. This has nothing to do with samples, so there is no factor of 2 involved.

If you really want to see the hexadecimal values of the individual bytes, you can try something like [hex(x) for x in frames[0]] (this works in Python 3, where frames[0] is a bytes object). If you want to get the actual 2-byte numbers, use the format string '<h' with the struct module; paInt16 samples are signed, so '<h' is the matching code, whereas '<H' would interpret them as unsigned.

You might be interested in my tutorial about reading WAV files with the wave module, which covers some of your questions in more detail: http://nbviewer.jupyter.org/github/mgeier/python-audio/blob/master/audio-files/audio-files-with-wave.ipynb
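The numbers discussed in this answer can be checked without any audio hardware. The sketch below uses a chunk of silence as a stand-in for frames[0] and assumes Python 3, interleaved channels, and little-endian byte order (the usual case on common desktop platforms). The variable names other than those from the question are chosen only for this example.

```python
import struct
import sys

CHUNK = 1024          # frames per buffer
CHANNELS = 2
SAMPLE_WIDTH = 2      # bytes per sample for paInt16
RATE = 44100          # frames per second
RECORD_SECONDS = 5

bytes_per_frame = CHANNELS * SAMPLE_WIDTH         # 4 bytes
bytes_per_chunk = CHUNK * bytes_per_frame         # 4096 bytes
num_chunks = int(RATE / CHUNK * RECORD_SECONDS)   # loop iterations, one per chunk

# Stand-in for frames[0]: one chunk of silence instead of recorded audio.
chunk = bytes(bytes_per_chunk)

raw_size = len(chunk)                # size of the raw data
object_size = sys.getsizeof(chunk)   # raw data plus interpreter overhead

# Hexadecimal view of the first few individual bytes.
hex_bytes = [hex(x) for x in chunk[:4]]

# Decode all samples as signed 16-bit little-endian integers (assumption:
# little-endian byte order; the two channels are interleaved).
fmt = "<%dh" % (CHUNK * CHANNELS)
samples = struct.unpack(fmt, chunk)
left = samples[0::2]
right = samples[1::2]

# Packing the samples again gives back the original bytes.
repacked = struct.pack(fmt, *samples)

print(bytes_per_chunk, num_chunks, raw_size, object_size > raw_size)
```

The exact value of object_size depends on the Python version and platform, which is why only the comparison with raw_size is printed.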