Python Wave byte data

Tags: python, audio, wave

I'm trying to read the data from a .wav file.

import os
import wave
# wave.open() does not expand "~", so expand it explicitly
wr = wave.open(os.path.expanduser("~/01 Road.wav"), 'rb')
# sample width is 2 bytes
# number of channels is 2
wave_data = wr.readframes(1)
print(wave_data)

This gives:

b'\x00\x00\x00\x00'

Which is the "first frame" of the song. These 4 bytes obviously correspond to the (2 channels * 2 byte sample width) bytes per frame, but what does each byte correspond to?

In particular, I'm trying to convert it to a mono amplitude signal.

asked Dec 19 '13 09:12

jameh

3 Answers

If you want to understand what the 'frame' is you will have to read the standard of the wave file format. For instance: https://web.archive.org/web/20140221054954/http://home.roadrunner.com/~jgglatt/tech/wave.htm

From that document:

The sample points that are meant to be "played" ie, sent to a Digital to Analog Converter(DAC) simultaneously are collectively called a sample frame. In the example of our stereo waveform, every two sample points makes up another sample frame. This is illustrated below for that stereo example.

sample       sample              sample
frame 0      frame 1             frame N
 _____ _____ _____ _____         _____ _____
| ch1 | ch2 | ch1 | ch2 | . . . | ch1 | ch2 |
|_____|_____|_____|_____|       |_____|_____|
 _____
|     | = one sample point
|_____|
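As a rough illustration of the layout above, here is a minimal sketch that splits interleaved frame bytes into one list per channel. It assumes 16-bit little-endian signed PCM samples; `split_channels` is a hypothetical helper name, not part of the `wave` module.

```python
import struct

def split_channels(frames: bytes, nchannels: int) -> list:
    """Return one list of signed 16-bit samples per channel (assumes 16-bit PCM)."""
    count = len(frames) // 2  # 2 bytes per sample point
    samples = struct.unpack("<%dh" % count, frames)
    # Sample points are interleaved: ch1, ch2, ch1, ch2, ...
    return [list(samples[ch::nchannels]) for ch in range(nchannels)]

# Two stereo frames built by hand: (1, -1) and (2, -2)
frames = struct.pack("<4h", 1, -1, 2, -2)
left, right = split_channels(frames, 2)
print(left, right)  # [1, 2] [-1, -2]
```

With a real file, `frames` would be the bytes returned by `readframes()` and `nchannels` the value from `getnchannels()`.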

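To convert to mono you could do something like this (Python 3, assuming 16-bit stereo PCM):

import struct
import wave

def stereo_to_mono(left, right):
    """Average two signed 16-bit samples."""
    return (left + right) // 2

with wave.open('piano2.wav', 'rb') as wr:
    nchannels, sampwidth, framerate, nframes, comptype, compname = wr.getparams()
    # This sketch only handles 2 channels with 2 bytes per sample
    assert nchannels == 2 and sampwidth == 2
    frames = wr.readframes(nframes)

# Decode whole samples, not single bytes: each sample is a
# little-endian signed 16-bit integer, ordered left, right, left, right, ...
samples = struct.unpack('<%dh' % (len(frames) // 2), frames)

mono = [stereo_to_mono(s1, s2) for s1, s2 in zip(samples[0::2], samples[1::2])]
new_frames = struct.pack('<%dh' % len(mono), *mono)

with wave.open('piano_mono.wav', 'wb') as ww:
    ww.setparams((1, sampwidth, framerate, len(mono), comptype, compname))
    ww.writeframes(new_frames)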

There is no clear-cut way to go from stereo to mono. You could just drop one channel. Above, I am averaging the channels. It all depends on your application.
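To make the choice concrete, here is a small sketch that applies either strategy to raw frame bytes. It assumes 16-bit little-endian stereo PCM; `to_mono` and its `strategy` parameter are hypothetical names introduced only for illustration.

```python
import struct

def to_mono(frames: bytes, strategy: str = "average") -> bytes:
    """Reduce 16-bit stereo frame bytes to mono (assumes signed little-endian PCM)."""
    count = len(frames) // 2
    samples = struct.unpack("<%dh" % count, frames)
    left, right = samples[0::2], samples[1::2]
    if strategy == "left":
        mono = left        # drop the right channel
    elif strategy == "right":
        mono = right       # drop the left channel
    elif strategy == "average":
        mono = [(l + r) // 2 for l, r in zip(left, right)]
    else:
        raise ValueError("unknown strategy: %r" % strategy)
    return struct.pack("<%dh" % len(mono), *mono)

# Two frames: (100, 200) and (-100, -300)
frames = struct.pack("<4h", 100, 200, -100, -300)
print(struct.unpack("<2h", to_mono(frames, "left")))     # (100, -100)
print(struct.unpack("<2h", to_mono(frames, "average")))  # (150, -200)
```

The average of two 16-bit values always fits in 16 bits, so no clipping is needed. Summing the channels without dividing could overflow.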

answered Sep 21 '22 06:09

William Denman

For wav file IO I prefer to use scipy. It is perhaps overkill for reading a wav file, but generally after reading the wav it is easier to do downstream processing.

import scipy.io.wavfile
fs1, y1 = scipy.io.wavfile.read(filename)

Here fs1 is the sample rate and y1 is a NumPy array with N rows (one per sample frame) and one column per channel; a mono file gives a 1-D array instead. You don't say how you'd like to convert to mono. You can take the average, or whatever else you'd like. For the average, use

monoChannel = y1.mean(axis=1)
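One detail worth watching: `mean` returns floating-point values, so the result usually needs casting back to the original integer type before it is written out as 16-bit audio. The sketch below shows this on a hand-made array rather than a real file; `average_to_mono` is a hypothetical helper name.

```python
import numpy as np

def average_to_mono(y: np.ndarray) -> np.ndarray:
    """Average the channel columns and return samples in the input dtype."""
    if y.ndim == 1:
        return y  # already mono
    # mean() yields float64; cast back so the samples stay e.g. int16
    return y.mean(axis=1).astype(y.dtype)

# Stand-in for the array returned by scipy.io.wavfile.read: 2 frames, 2 channels
stereo = np.array([[100, 200], [-100, -300]], dtype=np.int16)
mono = average_to_mono(stereo)
print(mono, mono.dtype)
```

The resulting array could then be passed, together with the sample rate, to `scipy.io.wavfile.write`.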

answered Sep 20 '22 06:09

Paul

As a direct answer to your question: two bytes make one 16-bit integer value in the "usual" little-endian way (low byte first), given by the explicit formula: value = data[0] + 256 * data[1]. (In Python 3, indexing a bytes object already yields integers; in Python 2, wrap each byte in ord().) But using the struct module is a better way to decode (and later reencode) such multibyte integers:

import struct
# "<" selects little-endian byte order, which is what .wav files use
print(struct.unpack("<HH", b"\x00\x00\x00\x00"))
# -> gives a 2-tuple of integers, here (0, 0)

or, if we want signed 16-bit integers (which is the case for 16-bit PCM .wav files), use "<hh" instead of "<HH". (I leave to you the task of figuring out how exactly two bytes can encode an integer value from -32768 to 32767 :-)
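For reference, here is a minimal sketch of both approaches side by side, assuming a 16-bit stereo frame in little-endian order. `decode_frame` and `to_signed16` are hypothetical helper names, and the by-hand version uses the two's complement rule.

```python
import struct

def decode_frame(frame: bytes) -> tuple:
    """Decode one 16-bit stereo frame into (left, right) signed samples."""
    return struct.unpack("<hh", frame)

def to_signed16(low: int, high: int) -> int:
    """Combine a low and a high byte by hand using two's complement."""
    value = low + 256 * high
    # Values with the top bit set represent negative numbers
    return value - 65536 if value >= 32768 else value

frame = b"\x00\x80\xff\x7f"
print(decode_frame(frame))                  # (-32768, 32767)
print(to_signed16(frame[0], frame[1]))      # -32768
print(to_signed16(frame[2], frame[3]))      # 32767
```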

answered Sep 21 '22 06:09

Armin Rigo
