I am looking for an open-source library or a simple implementation that measures how similar two audio clips are, for use in an iOS application.
Simply speaking, audio can be represented by a 1-D vector, so the similarity could be computed as the distance between two such vectors. However, the clips will have different lengths, so some pre-processing is needed.
Any pointers would be appreciated, thanks.
The similarity between two sequences of different lengths can be calculated efficiently with dynamic time warping (DTW):
http://en.wikipedia.org/wiki/Dynamic_time_warping
The algorithm is simple to implement yourself, and the wiki page links to many existing implementations.
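To give an idea of how little code the basic algorithm needs, here is a minimal sketch of the classic dynamic-programming form of DTW in Python (easy to port to Swift or C for iOS). The function name `dtw_distance` and the `dist` parameter are my own illustrative choices, not taken from any library, and the sketch has no windowing constraint or normalisation, so it runs in O(n*m) time.

```python
import math


def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Return the accumulated DTW cost between sequences a and b.

    `dist` is the per-element cost; the default assumes scalar samples.
    Pass a vector distance when the elements are feature vectors.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # prev[j] holds the best cost of aligning a[:i-1] with b[:j].
    # Row 0 is infinite except for the empty/empty alignment.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0

    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr

    return prev[m]
```

A lower value means the sequences are more similar. For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is 0 because the repeated 2 is absorbed by the warping, even though the lengths differ.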
Regarding "Simply speaking, audio can be represented by a 1-D vector":
It is more reasonable to split the audio into frames and turn it into a 2-D array of features, where each frame has an array of values (features) corresponding to different frequency bands. If you are dealing with music, an FFT of every frame is a good idea; for speech, it is better to calculate the mel-frequency cepstrum.
Again, you can use one of the many existing libraries for mel-frequency features; one of them is the speech recognition toolkit CMUSphinx.
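As a rough illustration of the framing step for the FFT case, the sketch below splits a list of samples into overlapping frames and computes a magnitude spectrum per frame. Everything here is an assumption for illustration: the function names, the frame and hop sizes, and the Hann window are my own choices, and the naive DFT is only there to keep the example dependency-free. In a real app you would use a proper FFT routine and, for speech, a library that computes mel-frequency features.

```python
import cmath
import math


def split_into_frames(samples, frame_size, hop_size):
    """Cut samples into overlapping frames; a trailing partial frame is dropped."""
    return [
        samples[start:start + frame_size]
        for start in range(0, len(samples) - frame_size + 1, hop_size)
    ]


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative frequency bins of one frame (naive DFT)."""
    n = len(frame)
    # Hann window to reduce leakage at the frame edges (an assumed, common choice).
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)))
        for k, s in enumerate(frame)
    ]
    spectrum = []
    for f in range(n // 2 + 1):
        acc = sum(
            windowed[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n)
        )
        spectrum.append(abs(acc))
    return spectrum


def spectral_features(samples, frame_size=256, hop_size=128):
    """Return one feature vector (magnitude spectrum) per frame."""
    frames = split_into_frames(samples, frame_size, hop_size)
    return [magnitude_spectrum(frame) for frame in frames]
```

The result is a list of per-frame vectors, so two clips of different lengths become two sequences of different lengths. Those can be compared with DTW, using for example the Euclidean distance between frame vectors as the per-step cost.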