I've created 2 functions : - One that records the microphone - One that plays the sound of the microphone
It records the microphone for 3 seconds
#include <iostream>
#include <Windows.h>
#include <vector>
using namespace std;
#pragma comment(lib, "winmm.lib")
short int waveIn[44100 * 3];
void PlayRecord();
void StartRecord()
{
const int NUMPTS = 44100 * 3; // 3 seconds
int sampleRate = 44100;
// 'short int' is a 16-bit type; I request 16-bit samples below
// for 8-bit capture, you'd use 'unsigned char' or 'BYTE' 8-bit types
HWAVEIN hWaveIn;
MMRESULT result;
WAVEFORMATEX pFormat;
pFormat.wFormatTag=WAVE_FORMAT_PCM; // simple, uncompressed format
pFormat.nChannels=1; // 1=mono, 2=stereo
pFormat.nSamplesPerSec=sampleRate; // 44100
pFormat.nAvgBytesPerSec=sampleRate*2; // = nSamplesPerSec * n.Channels * wBitsPerSample/8
pFormat.nBlockAlign=2; // = n.Channels * wBitsPerSample/8
pFormat.wBitsPerSample=16; // 16 for high quality, 8 for telephone-grade
pFormat.cbSize=0;
// Specify recording parameters
result = waveInOpen(&hWaveIn, WAVE_MAPPER,&pFormat,
0L, 0L, WAVE_FORMAT_DIRECT);
WAVEHDR WaveInHdr;
// Set up and prepare header for input
WaveInHdr.lpData = (LPSTR)waveIn;
WaveInHdr.dwBufferLength = NUMPTS*2;
WaveInHdr.dwBytesRecorded=0;
WaveInHdr.dwUser = 0L;
WaveInHdr.dwFlags = 0L;
WaveInHdr.dwLoops = 0L;
waveInPrepareHeader(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));
// Insert a wave input buffer
result = waveInAddBuffer(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));
// Commence sampling input
result = waveInStart(hWaveIn);
cout << "recording..." << endl;
Sleep(3 * 1000);
// Wait until finished recording
waveInClose(hWaveIn);
PlayRecord();
}
void PlayRecord()
{
const int NUMPTS = 44100 * 3; // 3 seconds
int sampleRate = 44100;
// 'short int' is a 16-bit type; I request 16-bit samples below
// for 8-bit capture, you'd use 'unsigned char' or 'BYTE' 8-bit types
HWAVEIN hWaveIn;
WAVEFORMATEX pFormat;
pFormat.wFormatTag=WAVE_FORMAT_PCM; // simple, uncompressed format
pFormat.nChannels=1; // 1=mono, 2=stereo
pFormat.nSamplesPerSec=sampleRate; // 44100
pFormat.nAvgBytesPerSec=sampleRate*2; // = nSamplesPerSec * n.Channels * wBitsPerSample/8
pFormat.nBlockAlign=2; // = n.Channels * wBitsPerSample/8
pFormat.wBitsPerSample=16; // 16 for high quality, 8 for telephone-grade
pFormat.cbSize=0;
// Specify recording parameters
waveInOpen(&hWaveIn, WAVE_MAPPER,&pFormat, 0L, 0L, WAVE_FORMAT_DIRECT);
WAVEHDR WaveInHdr;
// Set up and prepare header for input
WaveInHdr.lpData = (LPSTR)waveIn;
WaveInHdr.dwBufferLength = NUMPTS*2;
WaveInHdr.dwBytesRecorded=0;
WaveInHdr.dwUser = 0L;
WaveInHdr.dwFlags = 0L;
WaveInHdr.dwLoops = 0L;
waveInPrepareHeader(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));
HWAVEOUT hWaveOut;
cout << "playing..." << endl;
waveOutOpen(&hWaveOut, WAVE_MAPPER, &pFormat, 0, 0, WAVE_FORMAT_DIRECT);
waveOutWrite(hWaveOut, &WaveInHdr, sizeof(WaveInHdr)); // Playing the data
Sleep(3 * 1000); //Sleep for as long as there was recorded
waveInClose(hWaveIn);
waveOutClose(hWaveOut);
}
int main()
{
StartRecord();
return 0;
}
How can I change my StartRecord function (and I guess my PlayRecord function aswell), to make it record untill theres no input from the microphone?
(So far, those 2 functions are working perfectly - records the microphone for 3 seconds, then plays the recording)...
Thanks!
Edit: by no sound, I mean the sound level is too low or something (means the person probably isnt speaking)...
Because sound is a wave, it oscillates between high and low pressures. This waveform is usually recorded as positive and negative numbers, with zero being the neutral pressure. If you take the absolute value of the signal and keep a running average it should be sufficient.
The average should be taken over a long enough period that you account for the appropriate amount of silence. A very cheap way to keep an estimate of the running average is like this:
const double threshold = 50; // Whatever threshold you need
const int max_samples = 10000; // The representative running average size
double average = 0; // The running average
int sample_count = 0; // When we are building the average
while( sample_count < max_samples || average > threshold ) {
// New sample arrives, stored in 'sample'
// Adjust the running absolute average
if( sample_count < max_samples ) sample_count++;
average *= double(sample_count-1) / sample_count;
average += std::abs(sample) / sample_count;
}
The larger max_samples
, the slower average
will respond to a signal. After the sound stops, it will slowly trail off. However, it will be slow to rise again too. This would be fine for reasonably continuous sound.
With something like speech, which can have short or long pauses, you may want to use an impulse-based approach. You can just define the number of samples of 'silence' that you expect, and reset it whenever you receive an impulse that exceeds the threshold. Using the running average above with a much shorter window size will give you a simple way of detecting an impulse. Then you just need to count...
const int max_samples = 100; // Smaller window size for impulse
const int max_silence_samples = 10000; // Maximum samples below threshold
int silence = 0; // Number of samples below threshold
while( silence < max_silence_samples ) {
// Compute running average as before
//...
// Check for silence. If there's a signal, reset the counter.
if( average > threshold ) silence = 0;
else ++silence;
}
Adjusting threshold
and max_samples
will control the sensitivity to pops and clicks, while max_silence_samples
gives you control over how much silence is allowed before you stop recording.
There are undoubtedly more technical ways to achieve your goals, but it's always good to try the simple one first. See how you go with this.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With