
Which algorithm should I use for signal (sound) one class classification?

Update: this question was previously titled "Give me the name of a simple algorithm for signal(sound) pattern detection".

  1. My objective is to detect the presence of a given pattern in a noisy signal. I want to detect the presence of a species of insect by recording its sounds with a microphone. I have previously recorded the sound of the insect in a digital format.
  2. I am not trying to do voice recognition.
  3. I am already using convolution between the input signal and the pattern to determine their similarity level. But I think that this technique is better suited to discrete time (i.e. digital communications, where signals occur at fixed intervals) and to distinguishing an input signal between 2 given patterns (I have only one pattern).
  4. I am afraid to use neural networks, because I have never used them, and I don't know if I could embed that code.

Could you please point me to some other approaches, or try to convince me that my current approach is still a good idea, or that neural networks may be a feasible way?

Update: I already have 2 good answers, but another one would be welcome, and even rewarded.

Jader Dias asked Jan 14 '09 00:01



2 Answers

A step up from convolution is dynamic time warping, which can be thought of as a convolution operator that stretches and shrinks one signal to optimally match another.
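To make the stretching idea concrete, here is a minimal sketch of the textbook dynamic time warping distance in plain Python. The function name `dtw_distance` and the toy sequences are made up for illustration. The cost is quadratic in the sequence lengths, so in practice it would more likely be run on a short feature sequence (for example an amplitude envelope) than on raw audio samples.

```python
def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) dynamic time warping distance between
    two 1-D sequences, using the absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # b[j-1] is reused for another element of a
                cost[i][j - 1],      # a[i-1] is reused for another element of b
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]

# Hypothetical toy data: the same shape at half speed, and an unrelated shape.
reference = [0, 1, 2, 1, 0]
stretched = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
different = [2, 0, 2, 0, 2]

print(dtw_distance(reference, stretched))  # the stretched copy aligns at no cost
print(dtw_distance(reference, different))  # a different shape costs more
```

A small distance means the two sequences have the same shape even if one is played faster or slower than the other.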

Perhaps a simpler approach would be to do an FFT of the sample and determine if your insect has any particular frequencies that can be filtered on.
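A quick way to look for such a frequency is to take the magnitude spectrum and find its strongest bin. The sketch below uses NumPy. The 8 kHz sample rate, the 1200 Hz tone and the noise level are invented stand-ins for a real insect recording.

```python
import numpy as np

def dominant_frequency(signal, sample_rate):
    """Return the frequency in Hz of the strongest bin in the magnitude spectrum."""
    windowed = signal * np.hanning(len(signal))        # taper to reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])

# Synthetic stand-in for a recording: a 1200 Hz tone buried in white noise.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate               # one second of samples
rng = np.random.default_rng(0)
recording = np.sin(2 * np.pi * 1200 * t) + 0.5 * rng.standard_normal(t.size)

print(dominant_frequency(recording, sample_rate))      # should land close to 1200 Hz
```

If the real recording shows a clear peak like this, a band-pass filter around it may already be enough to separate the insect from the background.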

On the more complex side, but not quite a neural network, are SVM toolkits like libsvm and svmlight that you can throw your data at.

Regardless of the path you attempt, I would spend time exploring the nature of the sound your insect makes using tools like FFT. After all, it will be easier teaching a computer to classify the sound if you can do it yourself.

Paul answered Sep 20 '22 19:09


Sounds like a typical one-class classification problem, i.e. you want to search for one thing in a large pool of other things you don't care about.

What you want to do is find a set of features or descriptors that you can calculate for every short piece of your raw recording and then match against the features your clean recording produces. I don't think convolution is necessarily bad, though it is rather sensitive to noise, so it might not be optimal for your case. What might actually work in your case is pattern matching on a binned Fourier transform. You take the Fourier transform of your signal, giving you a power vs. frequency graph (rather than a power vs. time graph), then you divide the frequency axis into bands and take the average power in each band as a feature. If your data contains mostly white noise, the pattern you get from a raw insect sound of similar length will very closely match the pattern of your reference sound. This last trick has been used successfully (with some windowing) to crack audio captchas, as used by Google et al. to make their sites accessible to the blind.
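Here is a rough sketch of the binned Fourier transform feature with NumPy. The number of bands, the cosine similarity measure and the synthetic signals are my own assumptions for illustration, not part of the original trick.

```python
import numpy as np

def band_powers(signal, n_bands=16):
    """Average spectral power in n_bands roughly equal-width frequency bands."""
    windowed = signal * np.hanning(len(signal))        # the "some windowing" part
    power = np.abs(np.fft.rfft(windowed)) ** 2
    features = np.array([band.mean() for band in np.array_split(power, n_bands)])
    return features / features.sum()                   # make overall loudness irrelevant

def similarity(f1, f2):
    """Cosine similarity between two feature vectors (1.0 means identical shape)."""
    return float(np.dot(f1, f2) / (np.linalg.norm(f1) * np.linalg.norm(f2)))

# Synthetic stand-ins: a clean reference, a quieter noisy copy, and pure noise.
sample_rate = 8000
t = np.arange(2048) / sample_rate
rng = np.random.default_rng(1)
reference = np.sin(2 * np.pi * 1200 * t)
noisy_insect = 0.5 * reference + 0.3 * rng.standard_normal(t.size)
noise_only = 0.3 * rng.standard_normal(t.size)

ref_features = band_powers(reference)
sim_insect = similarity(ref_features, band_powers(noisy_insect))
sim_noise = similarity(ref_features, band_powers(noise_only))
print(sim_insect, sim_noise)   # the noisy insect should score much higher
```

A detection threshold between the two scores would have to be tuned on real recordings. A real insect call will also spread over more bands than this single tone.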

By the way, because your raw audio signal is digital (otherwise processing with a computer will not work ;-)) convolution is appropriate. You should perform the convolution between your reference signal and a sample of equal length from the raw input, starting from each sample. So, if your reference signal has length N and your raw sample has length M, where M>=N, then you should perform M-N+1=P convolutions between your reference signal and P samples from your raw input starting at 1..P. The most likely location of the reference sound in the raw sample is the starting position with the highest convolution score. Note that this becomes insanely time consuming very quickly.
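The sliding comparison described above is usually called cross-correlation, and NumPy's `np.correlate` with `mode="valid"` returns exactly those M-N+1 scores. In this sketch the chirp-like reference, the noise level and the offset of 700 are invented test data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical reference sound: a short chirp of N = 200 samples.
n = np.arange(200)
reference = np.sin(2 * np.pi * (0.02 * n + 0.0005 * n ** 2))

# Hypothetical raw input: M = 2000 samples of noise with the chirp hidden at 700.
raw = 0.3 * rng.standard_normal(2000)
true_offset = 700
raw[true_offset:true_offset + reference.size] += reference

# One score per starting position: scores[k] = sum(raw[k:k+N] * reference).
scores = np.correlate(raw, reference, mode="valid")    # length M - N + 1
best_offset = int(np.argmax(scores))

print(len(scores), best_offset)
```

For long recordings, an FFT-based correlation (for example `scipy.signal.correlate` with `method="fft"`) computes the same scores much faster than the direct sum.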

Fourier-transform-based matching as I explained above, using 50% overlapping samples from your raw data of twice the length of your reference sample, would at least be faster (though not necessarily better).
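A sketch of that scan, again with NumPy: windows of length 2N are taken every N samples, so any occurrence of the reference falls completely inside at least one window. The helper names, the band count and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def band_powers(signal, n_bands=16):
    """Unit-length vector of average spectral power per frequency band."""
    power = np.abs(np.fft.rfft(signal * np.hanning(len(signal)))) ** 2
    features = np.array([b.mean() for b in np.array_split(power, n_bands)])
    return features / np.linalg.norm(features)

def scan(raw, reference, n_bands=16):
    """Score windows of length 2N taken every N samples (50% overlap)."""
    n = len(reference)
    ref_features = band_powers(reference, n_bands)
    starts = list(range(0, len(raw) - 2 * n + 1, n))
    # Dot product of unit vectors = cosine similarity with the reference.
    scores = [float(band_powers(raw[s:s + 2 * n], n_bands) @ ref_features)
              for s in starts]
    return starts, scores

# Synthetic stand-in data: a 256-sample tone burst hidden at sample 2000.
rng = np.random.default_rng(3)
reference = np.sin(2 * np.pi * 0.14 * np.arange(256))
raw = 0.3 * rng.standard_normal(256 * 20)
raw[2000:2256] += reference

starts, scores = scan(raw, reference)
best = starts[int(np.argmax(scores))]
print(best)   # start of the best-matching window
```

This only needs about M/N transforms instead of M-N+1 full comparisons, at the price of locating the sound only to within one window.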

jilles de wit answered Sep 17 '22 19:09