 

How to convert a .wav file to a spectrogram in python3

I am trying to create a spectrogram from a .wav file in python3.

I want the final saved image to look similar to this image:

I have tried the following:

This Stack Overflow post: Spectrogram of a wave file

This post worked, somewhat. After running it, I got a spectrogram plot.

However, this graph does not contain the colors that I need. I need a spectrogram with color. I tried to tinker with the code to add the colors, but after spending significant time and effort on it, I couldn't figure it out.

I then tried this tutorial.

This code crashed (on line 17) when I tried to run it, with the error TypeError: 'numpy.float64' object cannot be interpreted as an integer.

line 17:

samples = np.append(np.zeros(np.floor(frameSize/2.0)), sig) 

I tried to fix it by casting

samples = int(np.append(np.zeros(np.floor(frameSize/2.0)), sig)) 

and I also tried

samples = np.append(np.zeros(int(np.floor(frameSize/2.0)), sig))     

However neither of these worked in the end.

I would really like to know how to convert my .wav files to spectrograms with color so that I can analyze them. Any help would be appreciated!

Please tell me if you want me to provide any more information about my version of Python, what I tried, or what I want to achieve.

asked Jun 27 '17 by Sreehari R

People also ask

Can Python read WAV files?

The wave module in Python's standard library provides an easy interface to the WAV audio format. The functions in this module can write audio data in raw format to a file-like object and read the attributes of a WAV file.
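
For example, a minimal sketch (the filename 'example.wav' is a placeholder) that reads a WAV file's attributes and raw frames with the wave module:

import wave

# Open a WAV file and inspect its basic attributes.
with wave.open('example.wav', 'rb') as wf:
    print("channels:", wf.getnchannels())
    print("sample width (bytes):", wf.getsampwidth())
    print("frame rate (Hz):", wf.getframerate())
    print("frames:", wf.getnframes())
    raw_data = wf.readframes(wf.getnframes())  # raw PCM audio as a bytes object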


2 Answers

Use scipy.signal.spectrogram.

import matplotlib.pyplot as plt
from scipy import signal
from scipy.io import wavfile

sample_rate, samples = wavfile.read('path-to-mono-audio-file.wav')
frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate)

plt.pcolormesh(times, frequencies, spectrogram)
plt.imshow(spectrogram)
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()

Be sure that your wav file is mono (single channel) and not stereo (dual channel) before trying to do this. I highly recommend reading the scipy documentation at https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.signal.spectrogram.html.

Putting plt.pcolormesh before plt.imshow seems to fix some issues, as pointed out by @Davidjb. If an unpacking error occurs, follow the steps by @cgnorthcutt below.
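
If you want output closer to the colored spectrogram in the question, a variant of the above (a sketch, not part of the original answer; the file path is a placeholder) converts the power values to decibels and passes an explicit colormap to pcolormesh:

import matplotlib.pyplot as plt
import numpy as np
from scipy import signal
from scipy.io import wavfile

sample_rate, samples = wavfile.read('path-to-mono-audio-file.wav')  # placeholder path
frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate)

# Convert power to decibels so quiet components remain visible; the small
# offset avoids log10(0). Any matplotlib colormap name works here.
plt.pcolormesh(times, frequencies, 10 * np.log10(spectrogram + 1e-10), cmap='inferno')
plt.colorbar(label='Power [dB]')
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()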

answered Oct 05 '22 by Tom Wyllie


I have fixed the errors you were facing in the code from http://www.frank-zalkow.de/en/code-snippets/create-audio-spectrograms-with-python.html.
This implementation is better because you can change the binsize (e.g. binsize=2**8).
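
The key fix for the crash you hit on line 17 is that int() must wrap only the np.floor(frameSize/2.0) passed to np.zeros, not the whole np.append(...) expression, i.e.:

samples = np.append(np.zeros(int(np.floor(frameSize/2.0))), sig)

You can see this line in the full code below.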

import numpy as np
from matplotlib import pyplot as plt
import scipy.io.wavfile as wav
from numpy.lib import stride_tricks

""" short time fourier transform of audio signal """
def stft(sig, frameSize, overlapFac=0.5, window=np.hanning):
    win = window(frameSize)
    hopSize = int(frameSize - np.floor(overlapFac * frameSize))

    # zeros at beginning (thus center of 1st window should be for sample nr. 0)
    samples = np.append(np.zeros(int(np.floor(frameSize/2.0))), sig)
    # cols for windowing
    cols = np.ceil((len(samples) - frameSize) / float(hopSize)) + 1
    # zeros at end (thus samples can be fully covered by frames)
    samples = np.append(samples, np.zeros(frameSize))

    frames = stride_tricks.as_strided(samples, shape=(int(cols), frameSize),
                                      strides=(samples.strides[0]*hopSize, samples.strides[0])).copy()
    frames *= win

    return np.fft.rfft(frames)

""" scale frequency axis logarithmically """
def logscale_spec(spec, sr=44100, factor=20.):
    timebins, freqbins = np.shape(spec)

    scale = np.linspace(0, 1, freqbins) ** factor
    scale *= (freqbins-1)/max(scale)
    scale = np.unique(np.round(scale))

    # create spectrogram with new freq bins
    newspec = np.complex128(np.zeros([timebins, len(scale)]))
    for i in range(0, len(scale)):
        if i == len(scale)-1:
            newspec[:,i] = np.sum(spec[:,int(scale[i]):], axis=1)
        else:
            newspec[:,i] = np.sum(spec[:,int(scale[i]):int(scale[i+1])], axis=1)

    # list center freq of bins
    allfreqs = np.abs(np.fft.fftfreq(freqbins*2, 1./sr)[:freqbins+1])
    freqs = []
    for i in range(0, len(scale)):
        if i == len(scale)-1:
            freqs += [np.mean(allfreqs[int(scale[i]):])]
        else:
            freqs += [np.mean(allfreqs[int(scale[i]):int(scale[i+1])])]

    return newspec, freqs

""" plot spectrogram """
def plotstft(audiopath, binsize=2**10, plotpath=None, colormap="jet"):
    samplerate, samples = wav.read(audiopath)

    s = stft(samples, binsize)

    sshow, freq = logscale_spec(s, factor=1.0, sr=samplerate)

    ims = 20.*np.log10(np.abs(sshow)/10e-6)  # amplitude to decibel

    timebins, freqbins = np.shape(ims)

    print("timebins: ", timebins)
    print("freqbins: ", freqbins)

    plt.figure(figsize=(15, 7.5))
    plt.imshow(np.transpose(ims), origin="lower", aspect="auto", cmap=colormap, interpolation="none")
    plt.colorbar()

    plt.xlabel("time (s)")
    plt.ylabel("frequency (hz)")
    plt.xlim([0, timebins-1])
    plt.ylim([0, freqbins])

    xlocs = np.float32(np.linspace(0, timebins-1, 5))
    plt.xticks(xlocs, ["%.02f" % l for l in ((xlocs*len(samples)/timebins)+(0.5*binsize))/samplerate])
    ylocs = np.int16(np.round(np.linspace(0, freqbins-1, 10)))
    plt.yticks(ylocs, ["%.02f" % freq[i] for i in ylocs])

    if plotpath:
        plt.savefig(plotpath, bbox_inches="tight")
    else:
        plt.show()

    plt.clf()

    return ims

ims = plotstft(filepath)
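
For example (the input path and output filename here are placeholders):

ims = plotstft('my-audio.wav', binsize=2**8, plotpath='spectrogram.png')

Passing plotpath saves the figure to disk instead of showing it interactively.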
answered Oct 05 '22 by Beginner