
Issue implementing Energy threshold algorithm for Voice Activity Detection

I am trying to implement the energy threshold algorithm for voice activity detection, but I am not getting meaningful energy values for frames of size wL.

int wL = 1784; // frame length: about 40 ms at 44.1 kHz
const double decay_constant = 0.90; // some optimal value between 0 and 1
double prevrms = 1.0; // avoid DivideByZero
double threshold; // some optimal value found after some experimentation
double total = 0.0; // running sum of squares

for (int i = 0; i < noSamples ; i += wL)
{
   for (int j = 0; j < wL; j++)
   {
     // Exponential decay
     total = total * decay_constant;
     total += (audioSample[j] * audioSample[j]); // sum of squares
   }

   double mean = total / wL;
   double rms = Math.Round(Math.Sqrt(mean), 2); // root mean square
   double prevrms = 1.0;

   if(rms/prevrms > threshold)
   {
      // voice detected
   }

   prevrms = rms;
   rms = 0.0;
}

What is wrong with the above implementation? rms comes out as 0.19 for every frame.

The other issue is speed: it took about 30 minutes to execute the above. Currently the implementation is O(n^2). I'm working with offline data, so it's not as big a deal (accuracy is the main objective), but any suggestions to improve efficiency would be highly appreciated.

Also, should I use other factors like auto-correlation and zero-crossing rate, or is energy alone sufficient?

The following is a summary of the WAV file I am using (clean conversational speech only):

// WAV file information
Sampling Frequency: 44100     Bits Per Sample:  16 
Channels: 2    nBlockAlign: 4   wavdata size: 557941248 bytes
Duration: 3162.932 sec    Samples: 139485312    Time between samples: 0.0227 ms
Byte position at start of samples: 44 bytes  (0x2C)

Chosen first sample to display:  1   (0.000 ms)
Chosen end  sample to display:  1784   (40.431 ms)

16 bit max possible value is:  32767  (0x7FFF)
16 bit min possible value is: -32768  (0x8000)
user762519 asked Dec 14 '25 14:12


1 Answer

I have found the problem. My inner for loop was not set up correctly: it has to index into the current frame, which starts at sample i. Basically, it should be something like this:

for (j = i; j < i + wL; j++)

Instead of:

for (j = 0; j < wL; j++)

The original loop went over the same first wL sample values over and over again, which is why every frame produced the same rms.
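For reference, here is a minimal sketch of how the corrected frame loop might look as complete C#. It is not the exact code from the question: the names FrameEnergy, FrameRms and DetectVoice are made up for illustration, the input is assumed to be a single channel of 16-bit samples (the stereo file above would need de-interleaving first), and the exponential decay is left out in favour of resetting the sum of squares for every frame. The floor of 1.0 on the previous RMS is likewise an assumption, carried over from the "avoid DivideByZero" initial value.

```csharp
using System;

public static class FrameEnergy
{
    // RMS of each complete frame of frameLength samples.
    // A trailing partial frame is ignored.
    public static double[] FrameRms(short[] samples, int frameLength)
    {
        int frameCount = samples.Length / frameLength;
        var rms = new double[frameCount];

        for (int f = 0; f < frameCount; f++)
        {
            int start = f * frameLength;   // first sample of this frame
            double sumOfSquares = 0.0;     // reset for every frame

            // Index relative to the frame start, not from 0.
            for (int j = start; j < start + frameLength; j++)
            {
                double s = samples[j];
                sumOfSquares += s * s;
            }

            rms[f] = Math.Sqrt(sumOfSquares / frameLength);
        }
        return rms;
    }

    // Flags a frame when its RMS exceeds the previous frame's RMS
    // by more than the given ratio (the same test as in the question).
    public static bool[] DetectVoice(double[] rms, double threshold)
    {
        var voiced = new bool[rms.Length];
        double prevRms = 1.0;              // avoids dividing by zero on the first frame

        for (int f = 0; f < rms.Length; f++)
        {
            voiced[f] = rms[f] / prevRms > threshold;
            prevRms = Math.Max(rms[f], 1.0); // assumed floor so silent frames cannot zero the divisor
        }
        return voiced;
    }
}
```

Calling FrameEnergy.FrameRms(samples, 1784) and passing the result to FrameEnergy.DetectVoice with an experimentally chosen threshold reproduces the ratio test from the question. Each sample is read exactly once, so the run time grows linearly with the number of samples.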

user762519 answered Dec 19 '25 07:12