
How to change pyplot.specgram x and y axis scaling?

I have never worked with audio signals before and know little about signal processing. Nevertheless, I need to represent an audio signal using the pyplot.specgram function from the matplotlib library. Here is how I do it.

import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile

rate, frames = wavfile.read("song.wav")
plt.specgram(frames)

The result I am getting is this nice spectrogram below: [image]

When I look at the x-axis and y-axis, which I suppose are the time and frequency domains respectively, I can't get my head around the fact that frequency is scaled from 0 to 1.0 and time from 0 to 80k. What is the intuition behind it and, more importantly, how can I represent it in a human-friendly format such that frequency is 0 to 100k and time is in seconds?

minerals asked Dec 18 '22 22:12


2 Answers

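As others have pointed out, you need to specify the sample rate; otherwise matplotlib falls back to its default of Fs=2, which gives a normalised frequency axis (between 0 and 1) and a time axis proportional to the sample index rather than in seconds (0 to 80k here). Fortunately this is as simple as: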

plt.specgram(frames, Fs=rate)
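To make the intuition concrete, here is a minimal sketch of the arithmetic behind the two axes. The helper name specgram_axis_limits and the sample count of 160,000 are assumptions chosen for illustration (the count is picked so that the default case matches the "0 to 80k" in the question); the limits are approximate, since the exact extent also depends on the window length.

```python
def specgram_axis_limits(n_samples, fs=2.0):
    """Approximate upper limits of the specgram axes.

    fs=2.0 mirrors matplotlib's default when Fs is not passed.
    Returns (max_time, max_frequency).
    """
    max_time = n_samples / fs       # seconds when fs is in Hz
    max_frequency = fs / 2.0        # Nyquist frequency
    return max_time, max_frequency


# Default Fs=2: frequency tops out at 1.0 and time at half the sample count
print(specgram_axis_limits(160_000))
# With an assumed 44.1 kHz recording the same samples span a few seconds
print(specgram_axis_limits(160_000, fs=44_100))
```

Passing the real sample rate therefore changes only the axis units, not the picture itself.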

To expand on Nukolas's answer, and combining the approaches from my "Changing plot scale by a factor in matplotlib" and "matplotlib intelligent axis labels for timedelta", we can get not only kHz on the frequency axis but also minutes and seconds on the time axis.

import copy
import datetime

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import scipy.io.wavfile as wavfile

# Copy the colormap so the registered 'viridis' is not modified in place
# ('viridis' may be missing on older versions of matplotlib)
cmap = copy.copy(plt.get_cmap('viridis'))
vmin = -40  # hide anything below -40 dB
cmap.set_under(color='k', alpha=None)

rate, frames = wavfile.read("song.wav")
fig, ax = plt.subplots()
pxx, freq, t, cax = ax.specgram(frames[:, 0], # first channel (assumes a stereo file)
                                Fs=rate,      # to get frequency axis in Hz
                                cmap=cmap, vmin=vmin)
cbar = fig.colorbar(cax)
cbar.set_label('Intensity dB')
ax.axis("tight")

# Prettify
ax.set_xlabel('time h:mm:ss')
ax.set_ylabel('frequency kHz')

scale = 1e3                     # Hz per kHz
ticks = ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x/scale))
ax.yaxis.set_major_formatter(ticks)

def timeTicks(x, pos):
    # x is the tick position in seconds
    d = datetime.timedelta(seconds=x)
    return str(d)
formatter = ticker.FuncFormatter(timeTicks)
ax.xaxis.set_major_formatter(formatter)
plt.show()

Result:

resulting image
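To see what labels the two tick formatters produce without drawing anything, the formatting logic can be pulled out into plain functions. This is only a sketch: the names khz_label and time_label are made up for illustration, and the sample tick values are arbitrary.

```python
import datetime


def khz_label(x, pos=None):
    """Format a tick value given in Hz as a kHz label."""
    return '{0:g}'.format(x / 1e3)


def time_label(x, pos=None):
    """Format a tick value given in seconds as h:mm:ss."""
    return str(datetime.timedelta(seconds=x))


for hz in (0, 5000, 22050):
    print(hz, '->', khz_label(hz))
for seconds in (0, 90, 3725):
    print(seconds, '->', time_label(seconds))
```

Either function can be wrapped in matplotlib.ticker.FuncFormatter exactly as in the code above. Note that a tick at a fractional number of seconds is printed with microseconds (for example 2.5 becomes 0:00:02.500000), so you may want to round x first on short clips.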

oystein answered Feb 15 '23 10:02


  • Firstly, a spectrogram is a representation of the spectral content of a signal as a function of time - this is a frequency-domain representation of the time-domain waveform (e.g. a sine wave, your file "song.wav" or some other arbitrary wave - that is, amplitude as a function of time).

  • The frequency values (y-axis, hertz) are wholly dependent on the sampling frequency of your waveform ("song.wav") and will range from 0 to half the sampling frequency, with the upper limit being the "Nyquist frequency" or "folding frequency" (https://en.wikipedia.org/wiki/Aliasing#Folding). The sampling frequency is defined as 1 / dt, with dt being the time interval between discrete samples of the waveform. The matplotlib specgram function cannot determine it from the samples alone: if it is not specified, it falls back to a default of Fs=2, which is why the frequency axis then runs from 0 to 1. Pass the option Fs=<sampling rate> to the specgram function to define it explicitly. It will be easier to get your head around what is going on if you figure out these variables and pass them to the specgram function yourself.

  • The time values (x-axis, seconds) depend purely on the length of your "song.wav". You may notice some whitespace or padding if you use a large window length to calculate each spectral slice (think of the individual spectra, which are arranged vertically and tiled horizontally to create the spectrogram image).

  • To make the axes more intuitive in the plot, use x- and y-axis labels. You can also scale the axis values (i.e. change the units) using a method similar to this.
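One way to "figure out these variables yourself" is to read them from the WAV header before plotting. The sketch below uses Python's standard wave module, which handles plain PCM files only; the function name wav_axis_info is hypothetical. The rate returned by scipy.io.wavfile.read would serve equally well.

```python
import wave


def wav_axis_info(path):
    """Return (sample_rate_hz, duration_s, nyquist_hz) from a WAV header."""
    with wave.open(path, 'rb') as wav:
        rate = wav.getframerate()
        n_frames = wav.getnframes()   # frames per channel
    return rate, n_frames / rate, rate / 2.0


# Example call, assuming a PCM file named "song.wav" exists:
# rate, duration, nyquist = wav_axis_info("song.wav")
```

With Fs=rate passed to specgram, the x-axis should end near duration and the y-axis at nyquist.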

Take home message - try to be a bit more verbose with your code: see below for my example.

    import matplotlib.pyplot as plt
    import numpy as np

    # generate a 5Hz sine wave
    fs = 50
    t = np.arange(0, 5, 1.0/fs)
    f0 = 5
    phi = np.pi/2
    A = 1
    x = A * np.sin(2 * np.pi * f0 * t +phi)

    nfft = 25

    # plot x-t, time-domain, i.e. source waveform
    plt.subplot(211)
    plt.plot(t, x)
    plt.xlabel('time')
    plt.ylabel('amplitude')

    # plot power(f)-t, frequency-domain, i.e. spectrogram
    plt.subplot(212)
    # call specgram function, setting Fs (sampling frequency) 
    # and nfft (number of waveform samples, defining a time window, 
    # for which to compute the spectra)
    plt.specgram(x, Fs=fs, NFFT=nfft, noverlap=5, detrend='mean', mode='psd')
    plt.xlabel('time')
    plt.ylabel('frequency')
    plt.show()

5Hz_spectrogram: [image]
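The padding mentioned earlier can be seen from where the spectral slices sit. The sketch below approximates the bin centres for the 5 Hz example; it follows my understanding of how matplotlib places the segments (one slice every NFFT - noverlap samples, centred on each window) and is not a copy of the library's code. The function name specgram_bins is made up for illustration.

```python
def specgram_bins(n_samples, fs, nfft, noverlap):
    """Approximate time and frequency bin centres of a one-sided spectrogram."""
    step = nfft - noverlap                        # hop between windows
    n_segments = (n_samples - nfft) // step + 1   # only full windows count
    times = [(nfft / 2 + k * step) / fs for k in range(n_segments)]
    freqs = [k * fs / nfft for k in range(nfft // 2 + 1)]
    return times, freqs


# Values from the 5 Hz sine example: 5 s at 50 Hz, NFFT=25, noverlap=5
times, freqs = specgram_bins(n_samples=250, fs=50, nfft=25, noverlap=5)
print(len(times), times[0], times[-1])
print(len(freqs), freqs[0], freqs[-1])
```

The first and last slice centres fall short of the start and end of the 5 s signal, which is one source of the blank margins, and the frequency bins are spaced Fs / NFFT apart up to roughly the Nyquist frequency.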

Nukolas answered Feb 15 '23 09:02