
Audio spectrum extraction from audio file by python

Sorry if this is a duplicate, but I wonder whether there is a Python library that can extract the sound spectrum from audio files. I want to take an audio file and write an algorithm that returns a set of data {TimeStampInFile; Frequency-Amplitude}.

I heard that this is usually called Beat Detection, but as far as I can see beat detection is not a precise method and is good only for visualisation, whereas I want to manipulate the extracted data and then convert it back to an audio file. I don't need to do this in real time.

I will appreciate any suggestions and recommendations.

Maksim Khaitovich asked Jun 24 '14 09:06




2 Answers

You can compute and visualize the spectrum and the spectrogram using scipy. For this test I used this audio file: vignesh.wav

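from scipy.io import wavfile # scipy library to read wav files
import numpy as np

AudioName = "vignesh.wav" # Audio File
fs, Audiodata = wavfile.read(AudioName) # fs: sample rate in Hz
if Audiodata.ndim > 1: # stereo or multichannel file: keep the first channel only
    Audiodata = Audiodata[:, 0]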

# Plot the audio signal in time
import matplotlib.pyplot as plt
plt.plot(Audiodata)
plt.title('Audio signal in time',size=16)

# spectrum
from scipy.fft import fft # fourier transform (scipy.fftpack is the legacy module)
n = len(Audiodata)
AudioFreq = fft(Audiodata)
AudioFreq = AudioFreq[0:int(np.ceil((n+1)/2.0))] # Half of the spectrum (non-negative frequencies)
MagFreq = np.abs(AudioFreq) # Magnitude
MagFreq = MagFreq / float(n)
# power spectrum
MagFreq = MagFreq**2
# Double every bin that has a mirrored negative-frequency counterpart
if n % 2 > 0: # odd n: all bins except DC
    MagFreq[1:] = MagFreq[1:] * 2
else: # even n: all bins except DC and Nyquist
    MagFreq[1:-1] = MagFreq[1:-1] * 2

plt.figure()
freqAxis = np.arange(0, int(np.ceil((n+1)/2.0)), 1.0) * (float(fs) / n) # bin frequencies in Hz
plt.plot(freqAxis/1000.0, 10*np.log10(MagFreq)) # Power spectrum
plt.xlabel('Frequency (kHz)')
plt.ylabel('Power spectrum (dB)')


# Spectrogram
from scipy import signal
N = 512 # Number of points in the fft
f, t, Sxx = signal.spectrogram(Audiodata, fs, window=signal.windows.blackman(N), nfft=N)
plt.figure()
plt.pcolormesh(t, f, 10*np.log10(Sxx)) # dB spectrogram
#plt.pcolormesh(t, f, Sxx) # Linear spectrogram
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [s]')
plt.title('Spectrogram with scipy.signal', size=16)

plt.show()

I tested all of the code and it works. You need numpy, matplotlib and scipy.
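If what you want is the {TimeStampInFile; Frequency-Amplitude} data from the question rather than a plot, the arrays returned by signal.spectrogram already hold it: Sxx[k, j] is the power at frequency f[k] and time t[j]. Here is a minimal sketch of reading them out. It assumes a synthetic two-tone signal instead of vignesh.wav, and the names x, frame_times and records are made up for illustration. It keeps only the strongest bin per frame, which may or may not be the reduction you need.

```python
import numpy as np
from scipy import signal

fs = 8000                                    # sample rate in Hz (assumed for the demo)
t = np.arange(2 * fs) / fs                   # two seconds of sample times
# Synthetic test signal: 500 Hz during the first second, 1500 Hz afterwards.
x = np.where(t < 1.0,
             np.sin(2 * np.pi * 500 * t),
             np.sin(2 * np.pi * 1500 * t))

N = 512                                      # FFT / window length
f, frame_times, Sxx = signal.spectrogram(
    x, fs, window=signal.windows.blackman(N), nfft=N)

# One record per time frame: (time in s, dominant frequency in Hz, its power).
peak_bins = Sxx.argmax(axis=0)
records = [(float(frame_times[j]), float(f[k]), float(Sxx[k, j]))
           for j, k in enumerate(peak_bins)]

for rec in records[:3]:
    print(rec)
```

To keep the full spectrum per frame instead of one peak, iterate over every k as well, or simply store f, frame_times and Sxx as they are.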

cheers

Jose R. Zapata answered Oct 17 '22 23:10


I think your question has three separate parts:

  1. How to load audio files in Python?
  2. How to calculate the spectrum in Python?
  3. What to do with the spectrum?

1. How to load audio files in Python?

You are probably best off using scipy, as it provides a lot of signal processing functions. For loading audio files:

import scipy.io.wavfile

samplerate, data = scipy.io.wavfile.read("mywav.wav")

Now you have the sample rate (samples/s) in samplerate and the samples as a numpy array in data. You may want to convert the data to floating point, depending on your application.

There is also a standard Python module, wave, for loading WAV files, but numpy/scipy offers a simpler interface and more options for signal processing.
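As a rough illustration of the floating-point conversion, here is a sketch that scales integer PCM samples to roughly [-1.0, 1.0]. The to_float helper is hypothetical (it is not part of scipy), it assumes signed integer samples such as 16-bit PCM, and a short test tone written to a temporary file stands in for mywav.wav so the snippet runs on its own.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile


def to_float(data):
    """Scale signed integer PCM samples to floats in roughly [-1.0, 1.0].

    Hypothetical helper for illustration. Unsigned 8-bit WAV data would
    also need an offset, which is not handled here.
    """
    if np.issubdtype(data.dtype, np.integer):
        full_scale = float(np.iinfo(data.dtype).max) + 1.0  # 32768.0 for int16
        return data.astype(np.float64) / full_scale
    return data.astype(np.float64)


# Write a one-second 440 Hz tone so the example does not depend on any file.
samplerate = 8000
t = np.arange(samplerate) / samplerate
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
wavfile.write(path, samplerate, tone)

rate, data = wavfile.read(path)
samples = to_float(data)
print(rate, data.dtype, samples.dtype)
```

Working in floats avoids integer overflow when you later scale, add or square the samples.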

2. How to calculate the spectrum

Brief answer: Use FFT. For more words of wisdom, see:

Analyze audio using Fast Fourier Transform

The longer answer is quite long. Windowing is very important: without it, spectral leakage smears each component across many frequency bins and you'll get strange spectra.
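To give an idea of why windowing matters, here is a small sketch comparing the FFT of one frame with and without a Hann window. The sample rate, frame length and the 1003 Hz test tone are arbitrary choices for the demo, and leakage_db is a made-up helper, not a library function.

```python
import numpy as np

fs = 8000                        # sample rate in Hz (arbitrary for the demo)
n = 1024                         # frame length in samples
t = np.arange(n) / fs
# A tone that does not fall exactly on an FFT bin (bins are fs/n = 7.8125 Hz apart).
x = np.sin(2 * np.pi * 1003.0 * t)

freqs = np.fft.rfftfreq(n, d=1.0 / fs)
rect = np.abs(np.fft.rfft(x))                   # no window (rectangular)
hann = np.abs(np.fft.rfft(x * np.hanning(n)))   # Hann window applied first


def leakage_db(mag, freqs, far_hz=3000.0):
    """Strongest bin above far_hz relative to the spectral peak, in dB."""
    far = mag[freqs > far_hz].max()
    return 20 * np.log10(far / mag.max())


rect_db = leakage_db(rect, freqs)
hann_db = leakage_db(hann, freqs)
print("leakage without window: %.1f dB" % rect_db)
print("leakage with Hann window: %.1f dB" % hann_db)
```

The tone is the only component in the signal, so anything far away from 1003 Hz is leakage; the windowed frame should show far less of it, at the cost of a slightly wider main peak.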

3. What to do with the spectrum

This is a bit more difficult. For longer signals, filtering is often performed in the time domain. Maybe if you tell us what you want to accomplish, you'll receive a good answer for this one. Calculating the frequency spectrum is one thing; getting meaningful results with it in signal processing is a bit more complicated.

(I know you did not ask this one, but I see it coming with a probability >> 0. Of course, it may be that you have good knowledge on audio signal processing, in which case this is irrelevant.)
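For the specific goal stated in the question (extract time/frequency data, manipulate it, convert it back to audio), one possible approach is a short-time Fourier transform round trip with scipy.signal.stft and scipy.signal.istft. This is only a sketch under assumptions: the two-tone test signal, the 2000 Hz cut-off and the frame length of 256 are arbitrary, and zeroing bins like this is a crude edit, not a recommended filter design.

```python
import numpy as np
from scipy import signal

fs = 8000                                     # sample rate in Hz (assumed)
t = np.arange(fs) / fs                        # one second of sample times
# Test signal: a 440 Hz tone plus a weaker 3000 Hz tone.
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

nperseg = 256                                 # frame length in samples
# Zxx[k, j] is the complex amplitude at frequency f[k] and time frame_times[j].
f, frame_times, Zxx = signal.stft(x, fs, window="hann", nperseg=nperseg)

# Unmodified round trip: the inverse transform gives the input back.
_, x_back = signal.istft(Zxx, fs, window="hann", nperseg=nperseg)
x_roundtrip = x_back[:len(x)]                 # drop the padding added by stft

# Manipulate the data: silence everything above 2000 Hz in every frame.
Zxx_edited = Zxx.copy()
Zxx_edited[f > 2000.0, :] = 0

_, y_full = signal.istft(Zxx_edited, fs, window="hann", nperseg=nperseg)
y = y_full[:len(x)]                           # edited audio, same length as x

print("max round-trip error:", np.max(np.abs(x_roundtrip - x)))
```

The result y can be scaled back to integers and saved with scipy.io.wavfile.write. Keeping the complex values (not just the magnitudes) is what makes the conversion back to audio possible.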

DrV answered Oct 18 '22 00:10