I am writing a Python function to return the loudness of a .wav file. RMS seems to be the best metric for this (see: Detect and record a sound with python).
audioop.rms() does the trick, but I'd like to avoid audioop as a dependency, and I already import numpy. However, I'm not getting the same RMS values from numpy, and would appreciate help in understanding what is going on.
The audioop page says that the RMS calculation is just what you'd expect, namely sqrt(sum(S_i^2)/n), where S_i is the i-th sample of the sound. Seems like it's not rocket science.
To use numpy, I first convert the sound to a numpy array, and always see identical min / max, and the same length of the data (so the conversion seems fine).
>>> d = np.frombuffer(data, np.int16)
>>> print (min(d), max(d)), audioop.minmax(data,2)
(-2593, 2749) (-2593, 2749)
but I get very different RMS values, not even ball-park close:
>>> numpy_rms = np.sqrt(sum(d*d)/len(d))
>>> print numpy_rms, audioop.rms(data, 2)
41.708703254716383, 120
The difference between them is variable, with no obvious pattern that I can see; e.g., I also get:
63.786714248938772, 402
62.779300661773405, 148
My numpy RMS code gives identical output to the one here: Numpy Root-Mean-Squared (RMS) smoothing of a signal
I don't see where I am going wrong, but something is off. Any help much appreciated.
EDITED / UPDATE:
In case it's useful, here's the code I ended up with. It's not quite as fast as audioop, but it is still plenty fast and good enough for my purpose. Of note, using np.mean() makes it MUCH faster (~100x) than my version using Python's sum().
def np_audioop_rms(data, width):
    """audioop.rms() using numpy; avoids another dependency for app"""
    #_checkParameters(data, width)
    if len(data) == 0: return None
    # width is in bytes: 1 -> int8, 2 -> int16, 4 -> int32
    fromType = (np.int8, np.int16, np.int32)[width//2]
    # convert to float before squaring so the squares cannot overflow
    d = np.frombuffer(data, fromType).astype(np.float64)
    rms = np.sqrt( np.mean(d**2) )
    return int( rms )
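For completeness, here is a minimal sketch of the same calculation applied to samples read with the standard-library wave module, since the original goal was the loudness of a .wav file. The helper name wav_rms is hypothetical, the sketch assumes 16-bit PCM, and the test tone is synthesized in memory only so that the example runs without a real file. For multi-channel files the samples are interleaved, so this gives one RMS over all channels.

```python
import io
import wave

import numpy as np


def wav_rms(fileobj):
    """Hypothetical helper: RMS of a 16-bit PCM WAV, computed in float64."""
    with wave.open(fileobj, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        frames = wf.readframes(wf.getnframes())
    # WAV data is little-endian; convert to float before squaring.
    d = np.frombuffer(frames, dtype="<i2").astype(np.float64)
    return float(np.sqrt(np.mean(d ** 2)))


# Build a one-second 440 Hz test tone in memory instead of reading a real file.
rate, amplitude = 8000, 1000
t = np.arange(rate) / rate
tone = np.round(amplitude * np.sin(2 * np.pi * 440 * t)).astype("<i2")

buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(rate)
    wf.writeframes(tone.tobytes())
buf.seek(0)

loudness = wav_rms(buf)  # for a sine wave this is close to amplitude / sqrt(2)
```

With a real file you would pass its path (or an open binary file object) to wav_rms instead of the in-memory buffer.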
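Perform the calculation in double precision, as the audioop.rms() code does. With an int16 array, d*d is also int16, so any square above 32767 wraps around silently and the sum comes out far too small. Convert to float before squaring:
d = np.frombuffer(data, np.int16).astype(np.float64)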
>>> import audioop, numpy as np
>>> data = b'abcdefgh'
>>> audioop.rms(data, 2)
25962
>>> d = np.frombuffer(data, np.int16)
>>> np.sqrt((d*d).sum()/(1.*len(d)))
80.131142510262507
>>> d = np.frombuffer(data, np.int16).astype(np.float64)
>>> np.sqrt((d*d).sum()/len(d))
25962.360851817772
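The wraparound can also be seen in isolation, without audioop. The following is a small sketch with two made-up samples (the values 300 and -300 are chosen only because their square, 90000, does not fit in int16):

```python
import numpy as np

# Two int16 samples whose squares exceed the int16 range (-32768..32767).
d = np.array([300, -300], dtype=np.int16)

# The element-wise product stays int16, so 90000 wraps modulo 65536 to 24464.
wrapped = d * d

# Converting to float64 first keeps the true squares.
exact = d.astype(np.float64) ** 2

rms_wrong = np.sqrt(wrapped.astype(np.float64).mean())  # sqrt(24464), about 156.4
rms_right = np.sqrt(exact.mean())                       # 300.0
```

NumPy does not raise or warn on this kind of array overflow, which is why the min, max, and length checks all looked fine while the RMS was wrong.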