I want to develop an algorithm that can recognize characteristics in data from line-in/mic audio. The audio stream will be music, and I want to extract characteristics that distinguish songs from each other. By distinguishing I mean being able to tell the genres of the songs apart.
One crucial thing that I absolutely want to detect is the time signature (bar/beat) of the song. For example, I want to know whether the song is in 3/4 time.
The only helpful articles I found were about BPM detection, but tempo alone is not enough to distinguish one song from another.
The FFT is a good start to get different characteristics from an audio stream but I don’t know where to begin. Is it possible to get the bar/beat with the FFT? Are there any good tutorials/code examples about this?
Is the FFT enough to get good characteristics of an audio stream or are there any other algorithms that are good for getting characteristics in audio streams?
Preferably I would do this in C# because that’s the programming language I have most experience with. Is this possible in C# or is another language better?
To sum up my question: I'm looking for any information about extracting characteristics from an audio stream to get the beat/bar and other information that distinguishes songs.
I enjoyed reading the related articles by this blogger:
http://www.redcode.nl/blog/2010/06/creating-shazam-in-java/
The author discusses fingerprinting songs. If you labelled a set of songs as having the qualities you're looking for and then fed the data into some kind of learning algorithm/classifier, you may have some success.
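To make the "label some audio, then feed it to a classifier" idea concrete, here is a minimal sketch in Python. It is not the blogger's fingerprinting method. It assumes a single spectral feature (the spectral centroid, a rough measure of "brightness") and a nearest-mean classifier, and it uses synthetic sine tones in place of real songs. All names (`spectral_centroid`, `train`, `classify`, `tone`) are hypothetical, and the naive DFT is only there to keep the example dependency-free; a real project would use an FFT library and many more features.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (use an FFT library in practice)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz; returns 0.0 for a silent frame."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0.0:
        return 0.0
    n = len(frame)
    return sum((k * sample_rate / n) * m for k, m in enumerate(mags)) / total


def train(labelled_frames, sample_rate):
    """Build a model {label: mean centroid} from (label, frame) pairs."""
    sums = {}
    for label, frame in labelled_frames:
        s, c = sums.get(label, (0.0, 0))
        sums[label] = (s + spectral_centroid(frame, sample_rate), c + 1)
    return {label: s / c for label, (s, c) in sums.items()}


def classify(model, frame, sample_rate):
    """Return the label whose mean centroid is closest to this frame's centroid."""
    value = spectral_centroid(frame, sample_rate)
    return min(model, key=lambda label: abs(model[label] - value))


def tone(freq_hz, sample_rate=8000, n=256):
    """Synthetic stand-in for audio: a pure sine tone (assumption for the demo)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


SAMPLE_RATE = 8000
labelled = [
    ("low", tone(250)), ("low", tone(500)),
    ("high", tone(2000)), ("high", tone(3000)),
]
model = train(labelled, SAMPLE_RATE)
print(classify(model, tone(2500), SAMPLE_RATE))  # prints "high"
```

With real music you would compute features over many short frames per song and summarize them (for example mean and variance) before training. A single feature like this will not separate genres on its own, and it says nothing about the time signature.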
As far as I know, this is not a solved problem, so I cannot give you a categorical answer.
Good luck!