What do the bytes in a .wav file represent?

3 Answers

You have probably heard that audio signals are represented by some kind of wave. If you have ever seen one of those wave diagrams with a line going up and down -- that's basically what's inside those files. Take a look at this picture from http://en.wikipedia.org/wiki/Sampling_rate

sampling

You see your audio wave (the gray line). The current value of that wave is measured repeatedly and stored as a number. Those are the numbers in those bytes. There are two things that can be adjusted here. The first is the number of measurements you take per second (that's the sampling rate, given in Hz -- how many samples per second you grab). The second is how precisely you measure. In the 2-byte case, you use two bytes for one measurement (normally values from -32768 to 32767). With those numbers you can recreate the original wave (up to a limited quality, of course, but that's always the case when storing things digitally). And recreating the original wave is what your speaker is trying to do on playback.
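To make the measuring step concrete, here is a minimal sketch in Python that samples a sine wave and rounds each measurement to a signed 16-bit integer. The function name `sample_sine` and the chosen frequency and sampling rate are assumptions made up for this illustration; a real recorder measures a microphone signal rather than computing a formula.

```python
import math

def sample_sine(freq_hz, sample_rate, n_samples, amplitude=32767):
    """Measure a sine wave n_samples times and store each value as a 16-bit integer."""
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                           # time of the n-th measurement in seconds
        value = math.sin(2 * math.pi * freq_hz * t)   # wave value between -1.0 and 1.0
        samples.append(round(value * amplitude))      # scale into the signed 16-bit range
    return samples

# Hypothetical numbers: a 1000 Hz tone measured 8000 times per second.
samples = sample_sine(freq_hz=1000, sample_rate=8000, n_samples=8)
print(samples)
```

A higher sampling rate gives more measurements per second, and more bytes per measurement gives finer steps between the smallest and largest value.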

There are some more things you need to know. First, since each value is two bytes, you need to know the byte order (big endian or little endian) to recreate the numbers correctly. Second, you need to know how many channels you have and how they are stored. Typically you would have mono (one channel) or stereo (two), but more is possible. With more than one channel, the values are often interleaved: you get one value for each channel for a given point in time, followed by all the values for the next point in time.

To illustrate: if you have 8 bytes of data for two channels of 16-bit numbers:

abcdefgh

Here a and b make up the first 16-bit number, which is the first value for channel 1, and c and d make up the first value for channel 2. e and f are the second value for channel 1, g and h the second value for channel 2. You wouldn't hear much there, because that does not come close to a second of data...
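The same 8-byte layout can be decoded with Python's `struct` module. This is only a sketch: the byte values are made up, and little-endian order is assumed for the first decode. Decoding the same bytes as big-endian shows why the byte order matters.

```python
import struct

# Eight made-up bytes standing in for "abcdefgh".
data = bytes([0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04, 0x00])

# "<" means little-endian, "h" is a signed 16-bit integer, so "<4h" reads 4 values.
values = struct.unpack("<4h", data)

# Interleaved stereo: even positions belong to channel 1, odd positions to channel 2.
channel_1 = values[0::2]
channel_2 = values[1::2]
print(channel_1, channel_2)  # (1, 3) (2, 4)

# The same bytes read with the wrong byte order give completely different numbers.
wrong_order = struct.unpack(">4h", data)
print(wrong_order)  # (256, 512, 768, 1024)
```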

If you put all that information together, you can calculate the bit rate, that is, how many bits of information the recorder generates per second. In our example, you generate 2 bytes per channel for every sample. With two channels, that is 4 bytes. You need about 44000 samples per second to represent the sounds a human being can normally hear. So you end up with 176000 bytes per second, which is 1408000 bits per second.
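The arithmetic above as a short Python sketch. The helper name `byte_rate` is made up for this example. The first call uses the rounded figure of 44000 samples per second from the text, the second uses the exact CD rate of 44100 Hz.

```python
def byte_rate(sample_rate, channels, bytes_per_sample):
    """Bytes generated per second of audio."""
    return sample_rate * channels * bytes_per_sample

rough = byte_rate(sample_rate=44000, channels=2, bytes_per_sample=2)
cd = byte_rate(sample_rate=44100, channels=2, bytes_per_sample=2)

print(rough, rough * 8)  # 176000 1408000
print(cd, cd * 8)        # 176400 1411200
```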

And of course, these are 2-byte values, not 2-bit values; with 2-bit values you would get really bad quality.

answered Oct 17 '22 00:10

kratenko

The first 44 bytes are commonly a standard RIFF header, as described here: http://tiny.systems/software/soundProgrammer/WavFormatDocs.pdf and here: http://www.topherlee.com/software/pcm-tut-wavformat.html

.wav files created on Apple systems (OS X/macOS/iOS) might add an 'FLLR' padding chunk to the header and thus increase the size of the initial RIFF header from 44 bytes to 4k bytes (perhaps for better disk or storage block alignment of the raw sample data).

The rest is very often 16-bit linear PCM in signed 2's-complement little-endian format, representing arbitrarily scaled samples at a rate of 44100 Hz.
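Because extra chunks such as 'FLLR' can sit in front of the samples, it is safer to walk the chunk list than to assume the data starts at byte 44. The following is a minimal sketch in Python: `find_chunk` is a hypothetical helper, and the tiny in-memory file with a 10-byte 'FLLR' chunk is made up for the demonstration, so it is not a copy of what Apple tools actually write.

```python
import io
import struct

def find_chunk(stream, wanted):
    """Return (offset, size) of the first chunk whose 4-byte ID equals `wanted`."""
    riff_id, _riff_size, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            raise ValueError(f"chunk {wanted!r} not found")
        chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
        if chunk_id == wanted:
            return stream.tell(), chunk_size
        # Skip the chunk body; RIFF chunks are padded to an even number of bytes.
        stream.seek(chunk_size + (chunk_size & 1), io.SEEK_CUR)

# Build a made-up mono file: "fmt " chunk, 10 bytes of filler, then two samples.
fmt_body = struct.pack("<HHIIHH", 1, 1, 44100, 88200, 2, 16)
filler = bytes(10)
sample_bytes = struct.pack("<2h", 100, -100)
body = (b"WAVE"
        + b"fmt " + struct.pack("<I", len(fmt_body)) + fmt_body
        + b"FLLR" + struct.pack("<I", len(filler)) + filler
        + b"data" + struct.pack("<I", len(sample_bytes)) + sample_bytes)
wav_bytes = b"RIFF" + struct.pack("<I", len(body)) + body

offset, size = find_chunk(io.BytesIO(wav_bytes), b"data")
decoded = struct.unpack("<2h", wav_bytes[offset:offset + size])
print(offset, size, decoded)  # 62 4 (100, -100)
```

Here the samples start at byte 62 instead of byte 44, which is exactly the case a fixed offset would get wrong.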

answered Oct 17 '22 01:10

hotpaw2

A WAVE (.wav) file contains a header, which holds the formatting information for the file's audio data. Following the header is the raw audio data. The exact meaning of each header field is listed below.

Positions  Typical Value Description

1 - 4      "RIFF"        Marks the file as a RIFF multimedia file.
                         Characters are each 1 byte long.

5 - 8      (integer)     The overall file size in bytes (32-bit integer)
                         minus 8 bytes. Typically, you'd fill this in after
                         file creation is complete.

9 - 12     "WAVE"        RIFF file format header. For our purposes, it
                         always equals "WAVE".

13-16      "fmt "        Format sub-chunk marker. Includes trailing space.

17-20      16            Length of the rest of the format sub-chunk below.

21-22      1             Audio format code, a 2 byte (16 bit) integer. 
                         1 = PCM (pulse code modulation).

23-24      2             Number of channels as a 2 byte (16 bit) integer.
                         1 = mono, 2 = stereo, etc.

25-28      44100         Sample rate as a 4 byte (32 bit) integer. Common
                         values are 44100 (CD), 48000 (DAT). Sample rate =
                         number of samples per second, or Hertz.

29-32      176400        (SampleRate * BitsPerSample * Channels) / 8
                         This is the Byte rate.

33-34      4             (BitsPerSample * Channels) / 8
                         1 = 8 bit mono, 2 = 8 bit stereo or 16 bit mono, 4
                         = 16 bit stereo.

35-36      16            Bits per sample. 

37-40      "data"        Data sub-chunk header. Marks the beginning of the
                         raw data section.

41-44      (integer)     The number of bytes of the data section below this
                         point. Also equal to (#ofSamples * #ofChannels *
                         BitsPerSample) / 8

45+                      The raw audio data.

I copied all of this from http://www.topherlee.com/software/pcm-tut-wavformat.html
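For a file that uses exactly this 44-byte layout, the table maps directly onto one `struct` format string. The sketch below is in Python; the names `HEADER_FORMAT`, `WavHeader`, and `parse_header` are made up for this example, and the header bytes are built in memory with the typical values from the table rather than read from a real file. Note that the table counts positions from 1, while Python offsets start at 0.

```python
import struct
from collections import namedtuple

# "<" = little-endian, 4s = 4 raw bytes, I = unsigned 32-bit, H = unsigned 16-bit.
HEADER_FORMAT = "<4sI4s4sIHHIIHH4sI"

WavHeader = namedtuple("WavHeader", [
    "riff_id", "riff_size", "wave_id",          # positions 1-12
    "fmt_id", "fmt_size", "audio_format",       # positions 13-22
    "channels", "sample_rate", "byte_rate",     # positions 23-32
    "block_align", "bits_per_sample",           # positions 33-36
    "data_id", "data_size",                     # positions 37-44
])

def parse_header(raw):
    """Decode the first 44 bytes of a canonical PCM .wav file."""
    return WavHeader._make(struct.unpack(HEADER_FORMAT, raw[:44]))

# Made-up header for an empty 16-bit stereo file at 44100 Hz.
example = struct.pack(HEADER_FORMAT, b"RIFF", 36, b"WAVE", b"fmt ", 16,
                      1, 2, 44100, 176400, 4, 16, b"data", 0)
header = parse_header(example)
print(header.channels, header.sample_rate, header.bits_per_sample)  # 2 44100 16
```

This only holds for files without extra chunks before "data"; anything else needs a chunk-by-chunk reader.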

answered Oct 17 '22 01:10

seanxiaoxiao

Tags:

audio

wav

user1330691
