Fastest way to get average value of frequencies within range [closed]

Question

I am new in python as well as in signal processing. I am trying to calculate mean value among some frequency range of a signal.

What I am trying to do is as follows:

import numpy as np
data = <my 1d signal>
lF = <lower frequency>
uF = <upper frequency>
ps = np.abs(np.fft.fft(data)) ** 2 #array of power spectrum

time_step = 1.0 / 2000.0

freqs = np.fft.fftfreq(data.size, time_step) # array of frequencies
idx = np.argsort(freqs) # sorting frequencies

sum = 0
c =0
for i in idx:
    if (freqs[i] >= lF) and (freqs[i] <= uF) :
        sum += ps[i]
        c +=1
avgValue = sum/c
print 'mean value is=',avgValue

I think calculation is fine, but it takes a lot of time like for data of more than 15GB and processing time grows exponentially. Is there any fastest way available such that I would be able to get mean value of power spectrum within some frequency range in fastest manner. Thanks in advance.

EDIT 1

I followed this code for calculation of power spectrum.

EDIT 2

This doesn't answer to my question as it calculates mean over the whole array/list but I want mean over part of the array.

EDIT 3

Solution by jez of using mask reduces time. Actually I have more than 10 channels of 1D signal and I want to treat them in a same manner i.e. average frequencies in a range of each channel separately. I think python loops are slow. Is there any alternate for that? Like this:

for i in xrange(0,15):
        data = signals[:, i]
        ps = np.abs(np.fft.fft(data)) ** 2
        freqs = np.fft.fftfreq(data.size, time_step)
        mask = np.logical_and(freqs >= lF, freqs <= uF )
        avgValue = ps[mask].mean()
        print 'mean value is=',avgValue

jez · Accepted Answer

The following performs a mean over a selected region:

mask = numpy.logical_and( freqs >= lF, freqs <= uF )
avgValue = ps[ mask ].mean()

For proper scaling of power values that have been computed as abs(fft coefficients)**2, you will need to multiply by (2.0 / len(data))**2 (Parseval's theorem)

Note that it gets slightly fiddly if your frequency range includes the Nyquist frequency—for precise results, handling of that single frequency component would then need to depend on whether data.size is even or odd). So for simplicity, ensure that uF is strictly less than max(freqs). [For similar reasons you should ensure lF > 0.]

The reasons for this are tedious to explain and even more tedious to correct for, but basically: the DC component is represented once in the DFT, whereas most other frequency components are represented twice (positive frequency and negative frequency) at half-amplitude each time. The even-more-annoying exception is the Nyquist frequency which is represented once at full amplitude if the signal length is even, but twice at half amplitude if the signal length is odd. All of this would not affect you if you were averaging amplitude: in a linear system, being represented twice compensates for being at half amplitude. But you're averaging power, i.e. squaring the values before averaging, so this compensation doesn't work out.

I've pasted my code for grokking all of this. This code also shows how you can work with multiple signals stacked in one numpy array, which addresses your follow-up question about avoiding loops in the multi-channel case. Remember to supply the correct axis argument both to numpy.fft.fft() and to my fft2ap().

Dietrich · Answer

If you really have a signal of 15 GB size, you'll not be able to calculate the FFT in an acceptable time. You can avoid using the FFT, if it is acceptable for you to approximate your frequency range by a band pass filter. The justification is the Poisson summation formula, which states that sum of squares is not changed by a FFT (or: the power is preserved). Staying in the time domain will let the processing time rise proportionally to the signal length.

The following code designs a Butterworth band path filter, plots the filter response and filters a sample signal:

import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

dd = np.random.randn(10**4)  # generate sample data
T = 1./2e3  # sampling interval
n, f_s = len(dd), 1./T  # number of points and sampling frequency

# design band path filter:
f_l, f_u = 50, 500  # Band from 50 Hz to 500 Hz
wp = np.array([f_l, f_u])*2/f_s  # normalized pass band frequnecies
ws = np.array([0.8*f_l, 1.2*f_u])*2/f_s  # normalized stop band frequencies
b, a = signal.iirdesign(wp, ws, gpass=60, gstop=80, ftype="butter",
                        analog=False)

# plot filter response:
w, h = signal.freqz(b, a, whole=False)
ff_w = w*f_s/(2*np.pi)
fg, ax = plt.subplots()
ax.set_title('Butterworth filter amplitude response')
ax.plot(ff_w, np.abs(h))
ax.set_ylabel('relative Amplitude')
ax.grid(True)
ax.set_xlabel('Frequency in Hertz')
fg.canvas.draw()


# do the filtering:
zi = signal.lfilter_zi(b, a)*dd[0]
dd1, _ = signal.lfilter(b, a, dd, zi=zi)

# calculate the avarage:
avg = np.mean(dd1**2)
print("RMS values is %g" % avg)
plt.show()

Read the documentation to Scipy's Filter design to learn how to modify the parameters of the filter.

If you want to stay with the FFT, read the docs on signal.welch and plt.psd. The Welch algorithm is a method to efficiently calculate the power spectral density of a signal (with some trade-offs).

Fastest way to get average value of frequencies within range [closed]

Tags:

performance

python

numpy

signal-processing

Muaz

2 Answers

jez

Dietrich

Recent Activity

Donate For Us

Fastest way to get average value of frequencies within range [closed]

Tags:

performance

python

numpy

signal-processing

Muaz

2 Answers

jez

Dietrich

Related questions

Recent Activity

Donate For Us