I am coding a basic frequency analisys of WAVE audio files, but I have trouble when it comes to convertion from WAVE frames to integer.
Here is the relevant part of my code:
import wave
track = wave.open('/some_path/my_audio.wav', 'r')
byt_depth = track.getsampwidth() #Byte depth of the file in BYTES
frame_rate = track.getframerate()
buf_size = 512
def byt_sum (word):
#convert a string of n bytes into an int in [0;8**n-1]
return sum( (256**k)*word[k] for k in range(len(word)) )
raw_buf = track.readframes(buf_size)
'''
One frame is a string of n bytes, where n = byt_depth.
For instance, with a 24bits-encoded file, track.readframe(1) could be:
b'\xff\xfe\xfe'.
raw_buf[n] returns an int in [0;255]
'''
sample_buf = [byt_sum(raw_buf[byt_depth*k:byt_depth*(k+1)])
- 2**(8*byt_depth-1) for k in range(buf_size)]
Problem is: when I plot sample_buf
for a single sine signal, I get
an alternative, wrecked sine signal.
I can't figure out why the signal overlaps udpside-down.
Any idea?
P.S.: Since I'm French, my English is quite hesitating. Feel free to edit if there are ugly mistakes.
It might be because you need to use an unsigned value for representing the 16bit samples. See https://en.wikipedia.org/wiki/Pulse-code_modulation
Try to add 32767 to each sample.
Also you should use the python struct module to decode the buffer.
import struct
buff_size = 512
# 'H' is for unsigned 16 bit integer, try 'h' also
sample_buff = struct.unpack('H'*buf_size, raw_buf)
The easiest way is to use a library that does the decoding for you. There are several Python libraries available, my favorite is the soundfile module:
import soundfile as sf
signal, samplerate = sf.read('/some_path/my_audio.wav')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With