My large (120gb) music collection contains many duplicate songs, and I've been trying to fingerprint tracks in the hopes of detecting duplicates. And since I'm a CS Major I'm very curious as to what is done out there? Nothing I do has nearly the accuracy of something like Shazam or Lala.com. How do they "hash" tracks? I have run a standard MD5 hash on all my files (26,000 files) and I found hundreds of equal hashes on different tracks, so that doesn't work.
I'm more interested in Lala.com since they work with full files, unlike Shazam, but I'm assuming both use a similar technique. Can anyone explain how to generate unique identifiers for music?
An audio fingerprint is a condensed digital summary of an audio signal, generated by extracting acoustic relevant characteristics of a piece of audio content. Along with matching algorithms, this digital signature permits to identify different versions of a single recording with the same title.
The Shazam algorithm distills samples of a song into fingerprints, and matches these fingerprints against fingerprints from known songs, taking into account their timing relative to each other within a song.
All you need to do is record a few seconds of the song on the app. The Shazam app uses sophisticated audio recognition technology to identify the music you hear in a matter of seconds so you can find out the name of the artist and track, watch videos, and even buy or stream the song on your device.
First off, the actual audio files are not what is being searched when you Shazam a song. Instead, Shazam has an audio fingerprint for each audio file in the database. The recording that a Shazam user submits is also made into an audio fingerprint which allows them to make comparisons accurately and quickly.
The seminal paper on audio fingerprinting is the work by Haitsma and Kalker in 2002-03. For each frame of audio, it preprocesses (differences across time frames and frequency bands) and then stores a binarized version of the frame's spectrum.
This procedure adds robustness. If the entire signal is shifted in time, it still works (at least, one can derive a lower bound on performance degradation). It is pretty robust to environmental noise. Since its inception, there have been many papers on low-level music similarity, so there is no single answer.
Do you have absolutely identical files, i.e., the signals are time aligned, bit depth is the same, sampling rate is the same? Then I would think a hash like MD5 should work. But if any of those parameters are changed, so will the hashes. In such an event, a procedure like the one mentioned earlier would work better.
Take a look at the ISMIR proceedings available free online. Fun stuff. http://www.ismir.net/
There are a lot of algorithms for acoustic fingerprinting. Some of the more popular ones are:
In fact libfooId is opensource , so you can check out its code in google-code!!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With