Determining Bit-Depth of a wav file

Tags: python, audio, wave, sox, bit-depth

I am looking for a fast, preferably standard-library mechanism to determine the bit-depth of a WAV file, e.g. '16-bit' or '24-bit'.

I am using a subprocess call to SoX to get a plethora of audio metadata, but a subprocess call is very slow, and the only piece of information that I can currently get reliably only from SoX is the bit-depth.

The built-in wave module does not have a function like "getbitdepth()" and is also not compatible with 24-bit WAV files. I could use a try/except to access the file's metadata using the wave module (if it works, manually record that it is 16-bit) and, on an exception, call SoX instead (where SoX will perform the analysis to accurately record its bit-depth). My concern is that this approach feels like guesswork. What if an 8-bit file is read? I would be manually assigning 16-bit when it is not.
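For reference, the wave module does report the sample width in bytes through getsampwidth(), so a bit-depth can be derived from it without guessing. The following is a minimal sketch, assuming the file is plain PCM that wave can open; files written with a WAVE_FORMAT_EXTENSIBLE header (common for 24-bit audio) may raise wave.Error on older Python versions. The helper name bit_depth_via_wave is hypothetical.

```python
import wave

def bit_depth_via_wave(path):
    """Return bits per sample as reported by the wave module.

    Returns None when wave cannot parse the header, so the caller
    can fall back to another tool instead of assuming 16-bit.
    """
    try:
        with wave.open(path, "rb") as wf:
            # getsampwidth() is the sample width in bytes (1, 2, 3 or 4).
            return wf.getsampwidth() * 8
    except wave.Error:
        return None
```

Because the width is reported in whole bytes, an 8-bit file yields 8 and a 24-bit file yields 24, which removes the need to hard-code 16 in the success branch.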

scipy.io.wavfile is also not compatible with 24-bit audio, so it creates a similar issue.

This tutorial is really interesting and even includes some really low-level (low-level for Python at least) scripting examples to extract information from the WAV file's header. Unfortunately, these scripts don't work for 16-bit audio.

Is there any way to simply (and without calling sox) determine what bit-depth the wav file I'm checking has?

The wave header parser script I'm using is as follows:

import struct

def print_wave_header(f):
    '''
    Function takes an audio file path as a parameter and
    returns a dictionary of metadata parsed from the header.
    Assumes the canonical 44-byte layout ("fmt " chunk right after
    "WAVE", then "data") and 16-bit samples for S1..S5.
    '''
    r = {} # the results of the header parse
    r['path'] = f
    with open(f, "rb") as fin: # "r" flag - read, "b" flag - binary; the file is closed automatically
        ChunkID = fin.read(4) # First four bytes are ChunkID, which must be "RIFF" in ASCII
        r["ChunkID"] = ChunkID
        ChunkSizeString = fin.read(4) # Total size of file in bytes, minus 8
        ChunkSize = struct.unpack('<I', ChunkSizeString) # '<I' treats the 4 bytes as a little-endian unsigned 32-bit integer
        TotalSize = ChunkSize[0] + 8 # The subscript is used because struct.unpack returns a tuple
        r["TotalSize"] = TotalSize
        DataSize = TotalSize - 44 # Number of data bytes, valid only for a 44-byte header
        r["DataSize"] = DataSize
        Format = fin.read(4) # "WAVE" in ASCII
        r["Format"] = Format
        SubChunk1ID = fin.read(4) # "fmt " in ASCII
        r["SubChunk1ID"] = SubChunk1ID
        SubChunk1SizeString = fin.read(4) # Should be 16 for PCM (Pulse Code Modulation)
        SubChunk1Size = struct.unpack("<I", SubChunk1SizeString) # little-endian unsigned 32-bit integer
        r["SubChunk1Size"] = SubChunk1Size[0]
        AudioFormatString = fin.read(2) # Should be 1 (PCM)
        AudioFormat = struct.unpack("<H", AudioFormatString) # '<H' is a little-endian unsigned 16-bit integer
        r["AudioFormat"] = AudioFormat[0]
        NumChannelsString = fin.read(2) # 1 for mono, 2 for stereo
        NumChannels = struct.unpack("<H", NumChannelsString)
        r["NumChannels"] = NumChannels[0]
        SampleRateString = fin.read(4) # e.g. 44100 (CD sampling rate)
        SampleRate = struct.unpack("<I", SampleRateString)
        r["SampleRate"] = SampleRate[0]
        ByteRateString = fin.read(4) # SampleRate*NumChannels*BitsPerSample/8 (176400 for 16-bit stereo at 44100 Hz)
        ByteRate = struct.unpack("<I", ByteRateString)
        r["ByteRate"] = ByteRate[0]
        BlockAlignString = fin.read(2) # NumChannels*BitsPerSample/8 (4 for 16-bit stereo)
        BlockAlign = struct.unpack("<H", BlockAlignString)
        r["BlockAlign"] = BlockAlign[0]
        BitsPerSampleString = fin.read(2) # The bit depth, e.g. 16 (CD audio) or 24
        BitsPerSample = struct.unpack("<H", BitsPerSampleString)
        r["BitsPerSample"] = BitsPerSample[0]
        SubChunk2ID = fin.read(4) # "data" in ASCII
        r["SubChunk2ID"] = SubChunk2ID
        SubChunk2SizeString = fin.read(4) # Number of data bytes, same as DataSize
        SubChunk2Size = struct.unpack("<I", SubChunk2SizeString)
        r["SubChunk2Size"] = SubChunk2Size[0]
        for i in range(1, 6): # First five 16-bit samples, each between -32768 and 32767
            SampleString = fin.read(2)
            r["S%d" % i] = struct.unpack("<h", SampleString)[0] # '<h' is a little-endian signed 16-bit integer
    return r
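The script above reads every field at a fixed offset, which only holds when the "fmt " chunk directly follows "WAVE". A hedged sketch of a standard-library alternative walks the RIFF chunks until it finds "fmt " and reads only the format code and bits-per-sample field. The function name wav_bit_depth is hypothetical, and the sketch assumes a well-formed little-endian RIFF file. For WAVE_FORMAT_EXTENSIBLE files (format code 0xFFFE) the value returned is the container size, which may be larger than the number of valid bits.

```python
import struct

def wav_bit_depth(path):
    """Return (audio_format, bits_per_sample) from the 'fmt ' chunk."""
    with open(path, "rb") as f:
        head = f.read(12)
        if len(head) < 12:
            raise ValueError("file too short to be RIFF/WAVE")
        riff, _size, wave_id = struct.unpack("<4sI4s", head)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                fmt = f.read(16)
                if len(fmt) < 16:
                    raise ValueError("truncated 'fmt ' chunk")
                # format code, channels, sample rate, byte rate,
                # block align, bits per sample
                audio_format, _, _, _, _, bits = struct.unpack("<HHIIHH", fmt)
                return audio_format, bits
            # Skip other chunks; RIFF pads odd-sized chunks to an even length.
            f.seek(chunk_size + (chunk_size & 1), 1)
```

Only the first few dozen bytes are usually read, so this avoids both the subprocess call and loading any audio data.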

asked Sep 13 '17 17:09

user3535074

2 Answers

Essentially the same answer as the one from Matthias, but with copy-pastable code.

Requirements

pip install soundfile

Code

import soundfile as sf

ob = sf.SoundFile('example.wav')
print('Sample rate: {}'.format(ob.samplerate))
print('Channels: {}'.format(ob.channels))
print('Subtype: {}'.format(ob.subtype))

Explanation

Channels: Usually 2, meaning you have one left speaker and one right speaker.
Sample rate: Audio signals are analog, but we want to represent them digitally, which means discretizing them in both value and time. The sample rate gives how many values we get per second; the unit is Hz. The sample rate needs to be at least double the highest frequency in the original sound, otherwise you get aliasing. Human hearing ranges from ~20 Hz to ~20 kHz, so you can cut off anything above 20 kHz, and a sample rate much above 40 kHz gains little for playback.
Bit-depth: The higher the bit-depth, the more dynamic range can be captured. Dynamic range is the difference between the quietest and loudest volume of an instrument, part or piece of music. A typical value seems to be 16 bit or 24 bit. A bit-depth of 16 bit has a theoretical dynamic range of 96 dB, whereas 24 bit has a dynamic range of 144 dB (source).
Subtype: PCM_16 means 16 bit depth, where PCM stands for Pulse-Code Modulation.
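The dynamic-range figures quoted above follow from the ratio between the largest and smallest representable amplitude of linear PCM, 20 * log10(2 ** bits), which is roughly 6.02 dB per bit. A small sketch of that arithmetic (the function name dynamic_range_db is hypothetical):

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM with the given bit depth."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print("{}-bit: {:.1f} dB".format(bits, dynamic_range_db(bits)))
# 16-bit: 96.3 dB
# 24-bit: 144.5 dB
```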

Alternative

If you only need a command-line tool, I can recommend MediaInfo:

$ mediainfo example.wav
General
Complete name                            : example.wav
Format                                   : Wave
File size                                : 83.2 MiB
Duration                                 : 8 min 14 s
Overall bit rate mode                    : Constant
Overall bit rate                         : 1 411 kb/s

Audio
Format                                   : PCM
Format settings                          : Little / Signed
Codec ID                                 : 1
Duration                                 : 8 min 14 s
Bit rate mode                            : Constant
Bit rate                                 : 1 411.2 kb/s
Channel(s)                               : 2 channels
Sampling rate                            : 44.1 kHz
Bit depth                                : 16 bits
Stream size                              : 83.2 MiB (100%)
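If the report has to be consumed from a script, the "Bit depth" line can be pulled out of the text. This is only a sketch, assuming the report text has already been captured (for example from a subprocess call, which the question wanted to avoid) and that the line has the "Bit depth : N bits" shape shown above. The function name bit_depth_from_mediainfo is hypothetical.

```python
import re

def bit_depth_from_mediainfo(report):
    """Extract N from a 'Bit depth : N bits' line, or return None."""
    match = re.search(r"^Bit depth\s*:\s*(\d+)\s*bits?", report, flags=re.MULTILINE)
    return int(match.group(1)) if match else None
```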

answered Oct 17 '22 23:10

Martin Thoma

I highly recommend the soundfile module (but mind you, I'm very biased because I wrote a large part of it).

There you can open your file as a soundfile.SoundFile object, which has a subtype attribute that holds the information you are looking for.

In your case that would probably be 'PCM_16' or 'PCM_24'.
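To turn such a subtype string into a number, a small mapping is enough. The sketch below is an assumption-laden helper, not part of soundfile: the name bit_depth_from_subtype is hypothetical, and it only covers the integer PCM subtypes plus 'FLOAT' and 'DOUBLE', returning None for anything else (compressed subtypes have no single fixed bit depth).

```python
import re

# Subtypes whose width is not spelled out in the name (assumed mapping).
_NAMED_WIDTHS = {"FLOAT": 32, "DOUBLE": 64}

def bit_depth_from_subtype(subtype):
    """Map a subtype string such as 'PCM_24' to bits per sample, or None."""
    # Matches 'PCM_16', 'PCM_24', 'PCM_32' as well as 'PCM_S8' and 'PCM_U8'.
    match = re.fullmatch(r"PCM_[SU]?(\d+)", subtype)
    if match:
        return int(match.group(1))
    return _NAMED_WIDTHS.get(subtype)
```

With soundfile installed, it could be called as bit_depth_from_subtype(soundfile.SoundFile('example.wav').subtype).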

answered Oct 17 '22 23:10

Matthias
