I've been working on a frequency shifter using a primitive FFT algorithm supplied by Rosetta Code. I understand that to frequency-shift a sampled signal, one applies an FFT to the original audio, multiplies the frequency of each resulting sine wave by a user-defined frequency-shift factor, and then sums the sine waves back together. When I run my algorithm, the output is of extremely low quality, as though there were not enough sine waves collected within the algorithm to reproduce the signal correctly in the first place. The algorithm is implemented in a class in a header file and called (correctly) elsewhere.
#include <cmath>
#include <complex>
#include <valarray>

typedef std::complex<double> Complex;
typedef std::valarray<Complex> CArray;

const double PI = 3.141592653589793238460;

class FrequencyShifter {
    float sampleRate;
public:
    FrequencyShifter() {
    }

    void setSampleRate(float inSampleRate) {
        sampleRate = inSampleRate;
    }

    double abs(double in0) {
        if (in0 >= 0) return in0;
        else return -in0;
    }

    void fft(CArray& x)
    {
        const size_t N = x.size();
        if (N <= 1) return;

        // divide
        CArray even = x[std::slice(0, N/2, 2)];
        CArray odd  = x[std::slice(1, N/2, 2)];

        // conquer
        fft(even);
        fft(odd);

        // combine
        for (size_t k = 0; k < N/2; ++k)
        {
            Complex t = std::polar(1.0, -2 * PI * k / N) * odd[k];
            x[k    ] = even[k] + t;
            x[k+N/2] = even[k] - t;
        }
    }

    double convertToReal(double im, double re) {
        return sqrt(abs(im*im - re*re));
    }

    void processBlock(float *inBlock, const int inFramesToProcess, float scale) {
        //inFramesToProcess is the number of samples in inBlock
        Complex *copy = new Complex[inFramesToProcess];
        for (int frame = 0; frame < inFramesToProcess; frame++) {
            copy[frame] = Complex((double)inBlock[frame], 0.0);
        }
        CArray data(copy, inFramesToProcess);
        fft(data);

        const float freqoffsets = sampleRate/inFramesToProcess;
        for (float x = 0; x < data.size()/2; x++) {
            for (float frame = 0; frame < inFramesToProcess; frame++) {
                inBlock[(int)frame] = (float)(convertToReal(data[(int)x].imag(), data[(int)x].real())*sin(freqoffsets*x*frame*scale));
            }
        }
    }
};
I'm assuming that part of the problem is that I'm only including sampleRate/inFramesToProcess frequencies for the sine waves to cover. Would sending larger audio files (and thus a larger *inBlock and inFramesToProcess) make the audio less grainy? How would I accomplish this without just changing the values or lengths of the arguments?
Here is an updated version of processBlock with a number of tweaks required to implement the frequency shift, which I will describe below:
void processBlock(float *inBlock, const int inFramesToProcess, float scale) {
    //inFramesToProcess is the number of samples in inBlock
    Complex *copy = new Complex[inFramesToProcess];
    for (int frame = 0; frame < inFramesToProcess; frame++) {
        copy[frame] = Complex((double)inBlock[frame], 0.0);
    }
    CArray data(copy, inFramesToProcess);
    delete[] copy; // valarray keeps its own copy, so free the temporary buffer
    fft(data);

    const float freqoffsets = 2.0*PI/inFramesToProcess;
    const float normfactor  = 2.0/inFramesToProcess;
    for (int frame = 0; frame < inFramesToProcess; frame++) {
        inBlock[frame] = 0.5*data[0].real();
        for (int x = 1; x < data.size()/2; x++) {
            float arg = freqoffsets*x*frame*scale;
            inBlock[frame] += data[x].real()*cos(arg) - data[x].imag()*sin(arg);
        }
        inBlock[frame] *= normfactor;
    }
}
Derivation
The spectrum you get from the FFT is complex-valued, which can be seen as providing a representation of your signal in terms of sine and cosine waves. Reconstructing the time-domain waveform can be done using the inverse transform, which is given by the relation:

$$x[n] = \frac{1}{N}\sum_{k=0}^{N-1} X[k]\, e^{j 2\pi kn/N}$$

Taking advantage of the frequency spectrum symmetry ($X[N-k] = X^{*}[k]$ for a real-valued signal), this can be expressed as:

$$x[n] = \frac{1}{N}\left(X[0] + 2\sum_{k=1}^{N/2-1} \mathrm{Re}\left\{X[k]\, e^{j 2\pi kn/N}\right\} + X[N/2]\cos(\pi n)\right)$$

or equivalently:

$$x[n] = \frac{1}{N}X[0] + \frac{2}{N}\sum_{k=1}^{N/2-1} \Big(\mathrm{Re}\{X[k]\}\cos(2\pi kn/N) - \mathrm{Im}\{X[k]\}\sin(2\pi kn/N)\Big) + \frac{1}{N}X[N/2]\cos(\pi n)$$
As you might have noticed, the terms at index 0 and N/2 are special cases, with purely real coefficients in the frequency domain. For simplicity, assuming the spectrum does not extend all the way up to N/2, you could drop the N/2 term and still get a reasonable approximation. For the other terms you would get a contribution which can be implemented as
normfactor = 2.0/inFramesToProcess;
normfactor*(data[x].real()*cos(arg) - data[x].imag()*sin(arg))
You would of course need to add all these contributions into the final buffer inBlock[frame], not simply overwrite previous results:
inBlock[frame] += normfactor*(data[x].real()*cos(arg) - data[x].imag()*sin(arg));
// ^^
Note that the normalization can be done on the final result after the loop to reduce the number of multiplications. In doing so, we must pay special attention to the DC term at index 0 (which has a coefficient of 1/N instead of 2/N):
inBlock[frame] = 0.5*data[0].real();
for (int x = 1; x < data.size()/2; x++) {
    float arg = freqoffsets*x*frame*scale;
    inBlock[frame] += data[x].real()*cos(arg) - data[x].imag()*sin(arg);
}
inBlock[frame] *= normfactor;
Finally, when generating the tones, the phase argument arg to sin and cos should be of the form 2*PI*k*n/inFramesToProcess (prior to the application of the scale factor), where n is the time-domain sample index and k is the frequency-domain index. The end result is that the computed frequency increment freqoffsets should really be 2.0*PI/inFramesToProcess.