I'm implementing a VOIP application that uses pure Java. There is an echo problem that occurs when users do not use headsets (mostly on laptops with built-in microphones).
What currently happens
The nuts and bolts of the VOIP application are just the plain DataLines of the Java Sound API. Essentially, I'd like to perform some digital signal processing on the audio data before writing it to the speaker line for output.
public synchronized void addAudioData(byte[] ayAudioData)
{
m_oBuffer.enqueue(ayAudioData);
this.notify();
}
As you can see the audio data arrives and is enqueued in a buffer. This is to cater for dodgy connections and to allow for different packet sizes. It also means I have access to as much audio data as I need for any fancy DSP operations before I play the audio data to the speaker line.
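For reference, the consumer side of that buffer can be sketched as below. This is an illustrative minimal version (the class and method names other than addAudioData are placeholders, and the real application would run any DSP and then write the dequeued bytes to the speaker line):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Minimal sketch of the producer/consumer packet buffer described above.
class AudioBuffer
{
    private final Queue<byte[]> m_oBuffer = new ArrayDeque<>();

    // Called when a packet arrives from the network.
    public synchronized void addAudioData(byte[] ayAudioData)
    {
        m_oBuffer.add(ayAudioData);
        this.notify(); // wake up the playback thread
    }

    // Blocks until at least one packet is available, then dequeues it.
    public synchronized byte[] nextAudioData()
    {
        while (m_oBuffer.isEmpty())
        {
            try
            {
                this.wait();
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
                return new byte[0];
            }
        }
        return m_oBuffer.remove();
    }
}
```

The wait/notify pair means the playback thread sleeps while the buffer is empty instead of busy-polling, which also gives the buffer time to absorb jitter from dodgy connections.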
I've managed to write one echo canceller that does work; however, it requires a lot of interactive user input, and I'd like to have an automatic echo canceller.
Manual echo canceller
public static byte[] removeEcho(int iDelaySamples, float fDecay, byte[] aySamples)
{
m_awDelayBuffer = new short[iDelaySamples];
m_aySamples = new byte[aySamples.length];
m_fDecay = fDecay;
System.out.println("Removing echo");
m_iDelayIndex = 0;
System.out.println("Sample length:\t" + aySamples.length);
for (int i = 0; i < aySamples.length; i += 2)
{
// update the sample
short wOldSample = getSample(aySamples, i);
// remove the echo
short wNewSample = (short) (wOldSample - fDecay * m_awDelayBuffer[m_iDelayIndex]);
setSample(m_aySamples, i, wNewSample);
// update the delay buffer
m_awDelayBuffer[m_iDelayIndex] = wNewSample;
m_iDelayIndex++;
if (m_iDelayIndex == m_awDelayBuffer.length)
{
m_iDelayIndex = 0;
}
}
return m_aySamples;
}
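For completeness, the getSample/setSample helpers used above pack and unpack 16-bit samples from the byte buffer. A sketch, assuming little-endian signed 16-bit PCM (swap the byte order if your AudioFormat is big-endian; the class name here is illustrative):

```java
class SampleUtil
{
    // Read the 16-bit little-endian sample starting at byte offset i.
    static short getSample(byte[] ayBuffer, int i)
    {
        return (short) ((ayBuffer[i] & 0xFF) | (ayBuffer[i + 1] << 8));
    }

    // Write a 16-bit sample back as two little-endian bytes at offset i.
    static void setSample(byte[] ayBuffer, int i, short wSample)
    {
        ayBuffer[i] = (byte) (wSample & 0xFF);
        ayBuffer[i + 1] = (byte) ((wSample >> 8) & 0xFF);
    }
}
```

The `& 0xFF` mask on the low byte matters: without it, Java's sign extension of the byte would corrupt the reassembled sample.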
Adaptive filters
I've read that adaptive filters are the way to go. Specifically, a Least Mean Squares (LMS) filter. However, I'm stuck. Most sample code for the above is in C or C++, and it doesn't translate well into Java.
Does anyone have advice on how to implement them in Java? Any other ideas would also be greatly appreciated. Thanks in advance.
The Acoustic Echo Cancellation (AEC) block is designed to remove echoes, reverberation, and unwanted added sounds from a signal that passes through an acoustic space.
The algorithm can suppress noise from the microphone signal as well as the acoustic echo. The echo cancellation is based on transform-domain LMS adaptive filter theory; a noise-reduction filter is introduced to achieve better performance.
It's been ages! Hope this is even the right class, but there you go:
/**
* This filter performs a pre-whitening Normalised Least Means Square on an
* array of bytes. This does the actual echo cancelling.
*
* Echo cancellation occurs with the following formula:
*
* e = d - X' * W
*
* e represents the echo-free signal. d represents the actual microphone signal
* with the echo. X' is the transpose of the loudspeaker signal. W is an array
* of adaptive weights.
*
*/
public class cNormalisedLeastMeansSquareFilter
implements IFilter
{
private byte[] m_ayEchoFreeSignal;// e
private byte[] m_ayEchoSignal;// d
private byte[] m_ayTransposeOfSpeakerSignal;// X'
private double[] m_adWeights;// W
/**
* The transpose and the weights need to be updated before applying the filter
* to an echo signal again.
*
* @param ayEchoSignal
* @param ayTransposeOfSpeakerSignal
* @param adWeights
*/
public cNormalisedLeastMeansSquareFilter(byte[] ayEchoSignal, byte[] ayTransposeOfSpeakerSignal, double[] adWeights)
{
m_ayEchoSignal = ayEchoSignal;
m_ayTransposeOfSpeakerSignal = ayTransposeOfSpeakerSignal;
m_adWeights = adWeights;
}
@Override
public byte[] applyFilter(byte[] ayAudioBytes)
{
// e = d - X' * W (element-wise; note this operates on raw bytes, so it
// assumes 8-bit signed samples -- 16-bit PCM would first need to be
// unpacked into shorts)
m_ayEchoFreeSignal = new byte[ayAudioBytes.length];
for (int i = 0; i < m_ayEchoFreeSignal.length; ++i)
{
m_ayEchoFreeSignal[i] = (byte) (m_ayEchoSignal[i] - m_ayTransposeOfSpeakerSignal[i] * m_adWeights[i]);
}
return m_ayEchoFreeSignal;
}
}
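Note that the class above applies the weights but never adapts them; the adaptive part of NLMS is the per-sample update W += mu * e * X / (X'X + eps). A self-contained sketch of the whole loop, operating on double samples (convert your bytes to shorts and then doubles first; the class name, method name, and parameters here are illustrative, not from the original code):

```java
class NlmsEchoCanceller
{
    // Runs an NLMS adaptive filter: x is the far-end (speaker) signal,
    // d is the microphone signal containing the echo. Returns the error
    // signal e, which is the echo-reduced output.
    static double[] cancel(double[] x, double[] d, int iTaps, double dMu)
    {
        double[] adWeights = new double[iTaps]; // W, adapted every sample
        double[] adE = new double[d.length];    // e, the filter output
        final double dEps = 1e-6;               // guards against division by zero

        for (int n = 0; n < d.length; n++)
        {
            // y = X' * W over the last iTaps speaker samples
            double dY = 0.0;
            double dNorm = dEps;
            for (int k = 0; k < iTaps && n - k >= 0; k++)
            {
                dY += adWeights[k] * x[n - k];
                dNorm += x[n - k] * x[n - k];
            }
            // e = d - X' * W
            adE[n] = d[n] - dY;
            // NLMS weight update, normalised by the input power
            for (int k = 0; k < iTaps && n - k >= 0; k++)
            {
                adWeights[k] += dMu * adE[n] * x[n - k] / dNorm;
            }
        }
        return adE;
    }
}
```

The normalisation by X'X is what makes this "Normalised" LMS: the effective step size shrinks when the speaker signal is loud, which keeps the filter stable for any 0 < mu < 2 regardless of signal level.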