The Python audioop documentation states that most of the available functions require "sound fragments."
The audioop module contains some useful operations on sound fragments. It operates on sound fragments consisting of signed integer samples 8, 16 or 32 bits wide, stored in Python strings.
What exactly is a sound fragment and how can I turn an existing .wav file into one?
Thanks.
A sound fragment represents a sequence of signed integer sound samples encoded in a bytes-like object. audioop supports representations of 1, 2, 3 or 4 bytes per sample.
A single sample can be converted with struct.pack (let's use 0, 1, 2, -1, 42 as examples):
from struct import pack
for sample in [0, 1, 2, -1, 42]:
    print(f'sample value {sample}, 1 byte/sample:', pack('b', sample))
    print(f' {sample}, 2 byte/sample:', pack('h', sample))
    print(f' {sample}, 4 byte/sample:', pack('i', sample))
This prints:
sample value 0, 1 byte/sample: b'\x00'
0, 2 byte/sample: b'\x00\x00'
0, 4 byte/sample: b'\x00\x00\x00\x00'
sample value 1, 1 byte/sample: b'\x01'
1, 2 byte/sample: b'\x01\x00'
1, 4 byte/sample: b'\x01\x00\x00\x00'
sample value 2, 1 byte/sample: b'\x02'
2, 2 byte/sample: b'\x02\x00'
2, 4 byte/sample: b'\x02\x00\x00\x00'
sample value -1, 1 byte/sample: b'\xff'
-1, 2 byte/sample: b'\xff\xff'
-1, 4 byte/sample: b'\xff\xff\xff\xff'
sample value 42, 1 byte/sample: b'*'
42, 2 byte/sample: b'*\x00'
42, 4 byte/sample: b'*\x00\x00\x00'
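struct has no format code for 3-byte samples, so that width needs a different tool. Here is a minimal sketch using int.to_bytes, assuming little-endian byte order (which is what the struct output above shows). pack_sample and unpack_sample are hypothetical helper names, not part of audioop or struct:

```python
def pack_sample(sample: int, width: int) -> bytes:
    """Encode one signed sample as `width` little-endian bytes (assumed order)."""
    return sample.to_bytes(width, byteorder="little", signed=True)


def unpack_sample(data: bytes) -> int:
    """Decode one signed little-endian sample back to a Python int."""
    return int.from_bytes(data, byteorder="little", signed=True)


for sample in [0, 1, 2, -1, 42]:
    packed = pack_sample(sample, 3)
    # Print the value, its 3-byte encoding, and the decoded value again.
    print(sample, packed, unpack_sample(packed))
```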
Let's assume we want to convert some sound samples we have as Python (signed) integers to a sound fragment using 2 bytes per sample (allowing sample values from -32768 to 32767, that is -2**15 to 2**15-1), as used on an audio CD:
import audioop
import array
samples = [0, 1000, 32767, 1, -1, -32768, -1000] # 7 samples of "music"
fragment = array.array('h', samples).tobytes()
print(f'Fragment {fragment} of length {len(fragment)}')
# convert back with audioop function
print([audioop.getsample(fragment, 2, i) for i in range(len(fragment) // 2)])
This prints:
Fragment b'\x00\x00\xe8\x03\xff\x7f\x01\x00\xff\xff\x00\x80\x18\xfc' of length 14
[0, 1000, 32767, 1, -1, -32768, -1000]
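The decoding step can also be done without audioop, which may matter because the module was deprecated in Python 3.11 and removed in 3.13. A sketch using only array, assuming the fragment is in native byte order and holds 2-byte samples:

```python
from array import array

samples = [0, 1000, 32767, 1, -1, -32768, -1000]
fragment = array('h', samples).tobytes()

# Decode the fragment back into integers: 'h' means signed 2-byte samples.
decoded = array('h')
decoded.frombytes(fragment)
print(decoded.tolist())
```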
As a last example, write a 3-second stereo sine wave to a .wav file and read it back:
import audioop
import wave
from array import array
from math import sin, pi
bytes_per_sample = 2
duration = 3. # seconds
sample_rate = 16000. # Hz
frequency = 440. # Hz
max_amplitude = 2**(bytes_per_sample * 8 - 1) - 1
amp = max_amplitude * 0.8
time = [i / sample_rate for i in range(int(sample_rate * duration))]
samples = [int(round(amp * sin(2 * pi * frequency * t))) for t in time]
fragment_mono = array('h', samples).tobytes()
fragment_stereo = audioop.tostereo(fragment_mono, bytes_per_sample, 1, 1)
with wave.open('sine_440hz_stereo.wav', 'wb') as wav:
    wav.setnchannels(2) # stereo
    wav.setsampwidth(bytes_per_sample)
    wav.setframerate(sample_rate)
    wav.writeframes(fragment_stereo)
# read wave file again
with wave.open('sine_440hz_stereo.wav', 'rb') as wav:
    fragment = wav.readframes(wav.getnframes())
# test whether written fragment and read fragment are same
assert fragment == fragment_stereo
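In a stereo fragment the samples are interleaved: left, right, left, right, and so on. The sketch below shows one way to split such a fragment back into channels. It assumes 16-bit samples and uses an in-memory io.BytesIO buffer instead of a file on disk; the sample values are made up for illustration:

```python
import io
import wave
from array import array

left = [0, 100, 200, 300]
right = [0, -100, -200, -300]
# Interleave the two channels: L0, R0, L1, R1, ...
interleaved = array('h', [s for pair in zip(left, right) for s in pair])

buffer = io.BytesIO()
with wave.open(buffer, 'wb') as wav:
    wav.setnchannels(2)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    wav.writeframes(interleaved.tobytes())

buffer.seek(0)
with wave.open(buffer, 'rb') as wav:
    n_channels = wav.getnchannels()
    fragment = wav.readframes(wav.getnframes())

samples = array('h')
samples.frombytes(fragment)
# Every n_channels-th sample belongs to the same channel.
left_read = samples[0::n_channels].tolist()
right_read = samples[1::n_channels].tolist()
print(left_read, right_read)
```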
import audioop
import wave
read = wave.open("C:/Users/Pratik/sampy/cat.wav")
string_wav = read.readframes(read.getnframes())
a = audioop.lin2alaw(string_wav,read.getsampwidth())
print(a)
----------------------------------------------------------
How do I convert a .wav file to A-law?
You can do so with the wave module. wave.open() opens the file, and readframes(n) returns at most n frames of audio as a bytes object, which is just what audioop wants.
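Because readframes(n) returns at most n frames, a large file can be processed in pieces rather than loaded at once. A minimal sketch, where iter_chunks is a hypothetical helper and the audio is a few frames of silence written to an in-memory buffer:

```python
import io
import wave


def iter_chunks(wav, frames_per_chunk):
    """Yield successive fragments of at most `frames_per_chunk` frames."""
    while True:
        chunk = wav.readframes(frames_per_chunk)
        if not chunk:  # empty bytes object: end of the audio data
            return
        yield chunk


# Build a tiny mono, 16-bit file in memory: 10 frames of silence (20 bytes).
buffer = io.BytesIO()
with wave.open(buffer, 'wb') as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(bytes(20))

buffer.seek(0)
with wave.open(buffer, 'rb') as wav:
    chunk_sizes = [len(chunk) for chunk in iter_chunks(wav, 4)]

print(chunk_sizes)
```

Each chunk is itself a valid fragment, so it can be handed to any function that expects one.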
For example, let's say you need the avg() function from audioop. This is how you could do it:
import wave
import audioop
wav = wave.open("piano2.wav")
print(audioop.avg(wav.readframes(wav.getnframes()), wav.getsampwidth()))
Outputs:
-2
Also, you may be interested in the rewind() method of the object returned by wave.open(). It puts the reading position back at the beginning of the audio data.
If you need to read through your wav file twice, you can write this:
wav = wave.open("piano2.wav")
print(audioop.avg(wav.readframes(wav.getnframes()), wav.getsampwidth()))
# if you don't call rewind(), the next readframes() call
# returns an empty bytes object, so audioop has no samples to work on
wav.rewind()
print(audioop.max(wav.readframes(wav.getnframes()), wav.getsampwidth()))
Or alternatively you can cache the frames:
wav = wave.open("piano2.wav")
string_wav = wav.readframes(wav.getnframes())
print(audioop.avg(string_wav, wav.getsampwidth()))
# no wav.rewind() needed: the frames are already in memory
print(audioop.max(string_wav, wav.getsampwidth()))
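The effect of rewind() can be checked directly. The sketch below uses a two-frame mono file built in an in-memory buffer (made-up sample values) and compares three consecutive reads:

```python
import io
import wave

# Two mono 16-bit frames with the sample values 1 and 2.
buffer = io.BytesIO()
with wave.open(buffer, 'wb') as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b'\x01\x00\x02\x00')

buffer.seek(0)
with wave.open(buffer, 'rb') as wav:
    first = wav.readframes(wav.getnframes())   # all frames
    second = wav.readframes(wav.getnframes())  # read position is at the end
    wav.rewind()
    third = wav.readframes(wav.getnframes())   # all frames again

print(first, second, third)
```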
You may want to look into the wave module. You probably want to open a file in read mode and use readframes to get the samples you need for audioop.
To answer what exactly a fragment is: it's a bytes object, which is just a string of bytes. Each sample takes one byte for 8-bit audio, two bytes for 16-bit audio, and four bytes for 32-bit audio. A frame holds one sample per channel, so a stereo frame is twice the size of a mono frame.
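The size of a fragment read from a file follows from the sample width and the channel count reported by the wave module. A small sketch, again using an in-memory buffer with made-up parameters (5 frames of 16-bit stereo silence):

```python
import io
import wave

n_channels, sample_width, n_frames = 2, 2, 5

buffer = io.BytesIO()
with wave.open(buffer, 'wb') as out:
    out.setnchannels(n_channels)
    out.setsampwidth(sample_width)
    out.setframerate(44100)
    out.writeframes(bytes(n_frames * n_channels * sample_width))

buffer.seek(0)
with wave.open(buffer, 'rb') as wav:
    fragment = wav.readframes(wav.getnframes())
    # One frame = one sample per channel.
    bytes_per_frame = wav.getsampwidth() * wav.getnchannels()
    frames_read = wav.getnframes()

print(len(fragment), frames_read, bytes_per_frame)
```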