Normalizing FFT spectrum magnitude to 0dB

Tags: python, frequency, fft, spectrum, audacity

I'm using the FFT to extract the amplitude of each frequency component from an audio file. Audacity already has a function called Plot Spectrum that does this. For this example audio file, which is composed of a 3 kHz sine and a 6 kHz sine, the spectrum looks like the following picture. The peaks are at 3 kHz and 6 kHz, with no extra frequencies.

[Image: Audacity Plot Spectrum result for the example file]

Now I need to implement the same function and plot a similar result in Python. With the help of rfft I'm close to the Audacity result, but I still have some questions about it.

[Image: spectrum plotted in Python from the rfft result]

What's the physical meaning of the amplitude in the second picture?
How do I normalize the amplitude to 0 dB like the one in Audacity?
Why do the frequencies above 6 kHz have such a high amplitude (≥90)? Can I scale those frequencies down to a relatively low level?

Related code:

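import numpy as np
from pylab import plot, show
from scipy.io import wavfile

# use the sample rate stored in the file instead of hard-coding 44100
fs, x = wavfile.read('sine3k6k.wav')

rfft = np.abs(np.fft.rfft(x))        # magnitude of the one-sided spectrum
p = 20*np.log10(rfft)                # unnormalized magnitude in dB
f = np.linspace(0, fs/2, len(p))     # frequency axis from 0 to Nyquist

plot(f, p)
show()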

Update

I multiplied the whole signal by a Hanning window (is that correct?) and got this. Most of the skirt amplitudes are now below 40.

[Image: spectrum after applying the Hanning window]

I then scaled the y-axis to decibels as @Mateen Ulhaq suggested. The result is closer to the Audacity one. Can I treat the amplitude below -90dB so low that it can be ignored?

Updated code:

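fs, x = wavfile.read('input/sine3k6k.wav')
x = x * np.hanning(len(x))            # window the whole signal

rfft = np.abs(np.fft.rfft(x))
rfft_max = rfft.max()                 # peak of this particular file
p = 20*np.log10(rfft/rfft_max)        # dB relative to that peak, so the maximum is 0 dB
f = np.linspace(0, fs/2, len(p))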

[Image: windowed spectrum normalized to its maximum, in dB]

About the bounty

With the code in the update above, I can measure the frequency components in decibels, and the highest possible value is 0 dB. But the method only works for one specific audio file, because it uses the rfft_max of that file. I want to measure the frequency components of multiple audio files against one common reference, just like Audacity does.

I also started a discussion in the Audacity forum, but it is still not clear to me how to implement this.

asked Jun 27 '18 07:06

WangYudong

1 Answer

After doing some reverse engineering on the Audacity source code, here are some answers. First, they use the Welch algorithm for estimating the PSD. In short, it splits the signal into overlapping segments, applies a window function to each, applies the FFT and averages the results. This helps to get better results when noise is present. Anyway, after extracting the necessary parameters, here is a solution that approximates Audacity's spectrum plot:

import numpy as np
from scipy.io import wavfile
from scipy import signal
from matplotlib import pyplot as plt

segment_size = 512

fs, x = wavfile.read('sine3k6k.wav')   # assumes a mono 16-bit file
x = x / 32768.0  # scale 16-bit PCM samples to [-1.0 .. 1.0]

noverlap = segment_size // 2  # integer number of overlapping samples
f, Pxx = signal.welch(x,                        # signal
                      fs=fs,                    # sample rate
                      nperseg=segment_size,     # segment size
                      window='hann',            # window type ('hann' is SciPy's name for the Hanning window)
                      nfft=segment_size,        # num. of samples in FFT
                      detrend=False,            # do not detrend (the DC part is kept)
                      scaling='spectrum',       # return power spectrum [V^2]
                      noverlap=noverlap)        # overlap between segments

# set 0 dB to the power of a sine wave with maximum amplitude (1.0)
ref = (1/np.sqrt(2))**2   # squared RMS, simply 0.5 ;)
p = 10 * np.log10(Pxx/ref)

fill_to = -150 * (np.ones_like(p))  # anything below -150dB is irrelevant
plt.fill_between(f, p, fill_to)
plt.xlim([f[2], f[-1]])             # skip the first low-frequency bins
plt.ylim([-90, 6])
# plt.xscale('log')   # uncomment if you want log scale on x-axis
plt.xlabel('f, Hz')
plt.ylabel('Power spectrum, dB')
plt.grid(True)
plt.show()

Some necessary explanations of the parameters:

The wave file is read as 16-bit PCM; to be compatible with Audacity it should be scaled so that |A| < 1.0.
segment_size corresponds to Size in Audacity's GUI.
The default window type is 'Hanning'; you can change it if you want.
The overlap is segment_size/2, as in the Audacity code.
The output window is framed to follow Audacity's style: they throw away the first low-frequency bins and cut off everything below -90 dB.
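
The scaling by 32768 in the code above assumes 16-bit samples. A minimal sketch of how other sample formats could be brought to the same range, assuming the dtypes that scipy.io.wavfile.read returns (uint8 for 8-bit files, int16, int32, or float). The helper name to_float is hypothetical and is not part of SciPy or Audacity:

```python
import numpy as np

def to_float(x):
    """Scale PCM samples to roughly [-1.0, 1.0) based on their dtype (hypothetical helper)."""
    if x.dtype == np.uint8:
        # 8-bit WAV data is unsigned and centered at 128
        return (x.astype(np.float64) - 128.0) / 128.0
    if np.issubdtype(x.dtype, np.integer):
        # int16 is divided by 32768, int32 by 2**31
        return x.astype(np.float64) / (np.iinfo(x.dtype).max + 1.0)
    # floating-point WAV data is assumed to be normalized already
    return x.astype(np.float64)

print(to_float(np.array([-32768, 0, 16384], dtype=np.int16)))
```

With this, x = to_float(x) could stand in for the division by 32768.0 when the bit depth of the input files is not known in advance.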

[Image: resulting power spectrum plot in dB]
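
To make the Welch steps concrete, here is a rough NumPy-only sketch of the computation with the parameters used above (50% overlap, Hann window, 'spectrum' scaling, no detrending). The function name welch_power_spectrum is made up for illustration, it assumes a mono signal with at least nperseg samples, and it has not been compared against Audacity's implementation:

```python
import numpy as np

def welch_power_spectrum(x, fs, nperseg=512):
    """Average windowed periodograms over 50%-overlapping segments (illustrative sketch)."""
    step = nperseg // 2                                     # 50% overlap
    n = np.arange(nperseg)
    window = 0.5 - 0.5 * np.cos(2 * np.pi * n / nperseg)    # periodic Hann window
    starts = range(0, len(x) - nperseg + 1, step)
    # squared FFT magnitude of every windowed segment
    powers = np.array([np.abs(np.fft.rfft(x[s:s + nperseg] * window)) ** 2
                       for s in starts])
    pxx = powers.mean(axis=0) / window.sum() ** 2           # power ("spectrum") scaling
    # one-sided spectrum: double every bin except DC (and Nyquist when nperseg is even)
    if nperseg % 2 == 0:
        pxx[1:-1] *= 2.0
    else:
        pxx[1:] *= 2.0
    f = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return f, pxx

fs = 44100
t = np.arange(fs) / fs                                      # one second of samples
x = 0.5 * np.sin(2 * np.pi * 3000 * t) + 0.5 * np.sin(2 * np.pi * 6000 * t)
f, pxx = welch_power_spectrum(x, fs)
print(f[np.argmax(pxx)])                                    # frequency of the strongest bin
```

Because every segment is normalized by the window sum, a sine whose frequency falls exactly on a bin center shows up with a power of amplitude**2 / 2, independent of the file length. Frequencies between bin centers read slightly lower.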

What's the physical meaning of the amplitude in the second picture?

It is basically the amount of energy in each frequency bin.

How do I normalize the amplitude to 0 dB like the one in Audacity?

You need to choose a reference point. Graphs in decibels are always relative to something. If you select the maximum-energy bin as the reference, your 0 dB point is that maximum (obviously), which is different for every file. It is acceptable to use the power of a sine wave with maximum amplitude as the reference instead; see the ref variable. The power of a sinusoidal signal is simply its squared RMS value, and to get the RMS you just divide the amplitude by sqrt(2). So for an amplitude of 1.0 the reference is simply 0.5. Please note that the factor in front of log10 is 10 and not 20, because we are dealing with the power of the signal and not its amplitude.
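
As a small worked example of that fixed reference, assuming samples scaled to [-1.0, 1.0] as above. The names FULL_SCALE_SINE_POWER and power_to_db are illustrative only, and the floor argument is an added guard against log10(0), not something taken from Audacity:

```python
import numpy as np

# squared RMS of a sine with amplitude 1.0, i.e. 0.5
FULL_SCALE_SINE_POWER = (1 / np.sqrt(2)) ** 2

def power_to_db(power, ref=FULL_SCALE_SINE_POWER, floor=1e-20):
    """Convert power to dB relative to a fixed reference; floor avoids log10(0)."""
    return 10 * np.log10(np.maximum(power, floor) / ref)

for amplitude in (1.0, 0.5, 0.1):
    power = amplitude ** 2 / 2           # mean power of a sine with this amplitude
    # the power-based and the amplitude-based dB values agree
    print(amplitude, power_to_db(power), 20 * np.log10(amplitude))
```

The three amplitudes come out at roughly 0 dB, -6.02 dB and -20 dB. Since ref does not depend on the input, the same conversion can be applied to every file and the results stay comparable.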

Can I treat the amplitude below -90dB so low that it can be ignored?

Yes, anything below -40 dB is usually considered negligible.
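
One way to put -90 dB in perspective, assuming 16-bit PCM input: a sine whose amplitude is a single quantization step (1/32768 of full scale) sits about 90 dB below a full-scale sine. Whether this is the reason for Audacity's -90 dB cutoff is not something this sketch establishes:

```python
import math

bits = 16
full_scale = 2 ** (bits - 1)                 # 32768 for 16-bit PCM
lsb_db = 20 * math.log10(1 / full_scale)     # amplitude ratio expressed in dB
print(round(lsb_db, 1))                      # -90.3
```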

answered Sep 22 '22 09:09

igrinis
