
Access MP3 music data using Python

Tags: python, mp3

I'm trying to write a Python script that finds duplicate mp3/4 files by comparing the songs' audio data. I have many mp3/4 files with similar file names but different ID3 tags. At first I tried looping through the files and comparing their md5 hashes (ignoring file names). This, of course, didn't work when the ID3 tags didn't match.

As a result, I'm looking for a way to extract only the music data from an mp3/4 in order to run it through md5 and find any duplicates. What is the best way to go about this?

Jack M. asked Jul 14 '10 21:07


1 Answer

Try using id3-py or mutagen to strip out all the tags (both ID3v1 and ID3v2, they can both be on the same file), then computing the MD5 on the result.

Assuming iTunes didn't manipulate the files beyond their tags, the remaining audio data should be identical. Transcoding would, of course, invalidate this approach, since re-encoded audio no longer matches byte for byte.
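
To illustrate the idea, here is a rough sketch that hashes only the bytes between the tags. Note that it does not use id3-py or mutagen: it is a standard-library-only stand-in that locates the ID3v2 header and ID3v1 trailer by hand, and it never modifies the files. The function names (`strip_id3`, `audio_md5`, `find_duplicates`) are made up for this example. It assumes plain MP3 files with at most one ID3v2 tag at the start and one ID3v1 tag at the end; it does not handle APEv2 tags or MP4 files, which store their metadata differently.

```python
import hashlib
from collections import defaultdict


def strip_id3(data):
    """Return `data` without a leading ID3v2 tag and a trailing ID3v1 tag."""
    start, end = 0, len(data)

    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4 size bytes.
    # The size is "syncsafe": only the low 7 bits of each byte are used.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)
        start = 10 + size
        if data[5] & 0x10:  # footer flag: a 10-byte footer follows the tag
            start += 10
        start = min(start, end)  # guard against a truncated file

    # ID3v1 tag: the last 128 bytes of the file, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]


def audio_md5(path):
    """MD5 of a file's contents with ID3v1/ID3v2 tags skipped."""
    with open(path, "rb") as f:
        return hashlib.md5(strip_id3(f.read())).hexdigest()


def find_duplicates(paths):
    """Group paths by audio hash; return only groups with more than one file."""
    groups = defaultdict(list)
    for path in paths:
        groups[audio_md5(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]
```

Something like `find_duplicates(glob.glob("music/*.mp3"))` would then give you lists of files whose audio bytes match. This reads each file fully into memory, which is fine for typical song sizes but could be changed to chunked reads for very large files.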

Nick T answered Sep 24 '22 00:09