I'm looking to log events corresponding to a specific sound, such as a car door slamming, or perhaps a toaster ejecting toast.
The system needs to be more sophisticated than a "loud noise detector"; it needs to be able to distinguish that specific sound from other loud noises.
The identification need not be zero-latency, but the processor needs to keep up with a continuous stream of incoming data from a microphone that is always on.
This answer indicates that a matched filter would be appropriate, but I am hazy on the details. I don't believe a simple cross-correlation on the audio waveform data between a sample of the target sound and the microphone stream would be effective, due to variations in the target sound.
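For concreteness, here is a minimal sketch of the plain waveform cross-correlation approach described above, written in pure Python so it runs without dependencies. The function names and the 0.8 threshold are illustrative assumptions, not values from any of the linked material; a real-time version would more likely use an FFT-based correlation from a numerical library.

```python
import math

def normalized_xcorr(template, stream):
    """Normalized cross-correlation score (range -1..1) at each lag of stream."""
    n = len(template)
    t_mean = sum(template) / n
    t = [x - t_mean for x in template]
    t_norm = math.sqrt(sum(x * x for x in t))
    scores = []
    for lag in range(len(stream) - n + 1):
        window = stream[lag:lag + n]
        w_mean = sum(window) / n
        w = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w))
        if t_norm == 0.0 or w_norm == 0.0:
            scores.append(0.0)  # flat template or window: correlation undefined
        else:
            dot = sum(a * b for a, b in zip(t, w))
            scores.append(dot / (t_norm * w_norm))
    return scores

def detect(template, stream, threshold=0.8):
    """Lags whose score reaches the threshold (0.8 is an arbitrary assumption)."""
    scores = normalized_xcorr(template, stream)
    return [lag for lag, s in enumerate(scores) if s >= threshold]
```

Because the score is normalized, a louder or quieter copy of the template still scores close to 1, but any change in the shape of the waveform (timing, pitch, reverberation) lowers it, which is the weakness suspected above.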
My question is also similar to this, which didn't get much attention.
I found an interesting paper on the subject.
It should work for your application as well, if not better than it does for vehicle sounds.
When analyzing the training data, it...
Does a Principal Component Analysis on the frequency vectors
Then to classify a sound, it...
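The Principal Component Analysis step mentioned above can be illustrated in isolation. This is only a rough sketch and not the paper's actual procedure: it assumes each sound frame has already been turned into a fixed-length frequency vector, and it estimates just the first principal component by power iteration. The function names are hypothetical, and a practical system would keep several components and use a linear algebra library.

```python
import math

def first_principal_component(vectors, iterations=200):
    """Estimate the mean and leading principal component of equal-length vectors."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    # The starting vector is arbitrary; the method fails only if it happens to
    # be exactly orthogonal to the leading component.
    w = [float(i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        w = [x / norm for x in w]
    return mean, w

def project(vector, mean, component):
    """Coordinate of a vector along the component, after mean removal."""
    return sum((x - m) * c for x, m, c in zip(vector, mean, component))
```

Projecting each frequency vector onto a few such components gives a much shorter feature vector, which is the usual reason for applying PCA before a classifier.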
This doctoral thesis, Non-Speech Environmental Sound Classification System for Autonomous Surveillance, by Cowling (2004), reports experimental results for different audio feature extraction and classification techniques. He used environmental sounds such as jangling keys and footsteps, and achieved an accuracy of 70%:
The best technique is found to be either Continuous Wavelet Transform feature extraction with Dynamic Time Warping or Mel-Frequency Cepstral Coefficients with Dynamic Time Warping. Both of these techniques achieve a 70% recognition rate.
If you limit yourself to one sound, you might be able to achieve a higher recognition rate.
The author also mentions that techniques that work fairly well with speech recognition (learning vector quantization and neural networks) don't work so well with environmental sounds.
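To give a feel for the Dynamic Time Warping half of the techniques quoted above, here is a minimal textbook sketch in pure Python. It is not taken from the thesis: it assumes the feature extraction (for example, MFCCs) has already been done elsewhere, so each sound is a sequence of feature vectors, and the function name is hypothetical.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    def frame_dist(u, v):
        # Euclidean distance between two feature vectors of equal length.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

A recorded example of the target sound would serve as the template, and an incoming segment would count as a match when its DTW cost against the template falls below a threshold chosen from test recordings. Unlike direct cross-correlation, this tolerates the sound being played out slightly faster or slower.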
I have also found a more recent article: Detecting Audio Events for Semantic Video Search, by Bugalho et al. (2009), where they detect sound events in movies (gunshots, explosions, etc.).
I have no experience in this area. I merely stumbled upon this material because your question piqued my interest. I'm posting my finds here in the hope that they help with your research.