Distinguishing instruments in a music file [closed]

Tags:

language-agnostic

Given a music file, is it possible to split out each instrument that is being played? I.e. let's say I have someSong.mp3, and in that song there's vocals, guitar, bass and drums. I'd want to get 4 "tracks" - one for each distinct instrument.

I'm guessing that it's almost impossible to do this, given that instruments can overlap, and it's notoriously difficult to distinguish overlapping voices let alone instruments.

However, if there is a library, an algorithm, or SOME way of doing this, I'd be curious to hear how.

408

asked Mar 30 '09 17:03

FreeMemory

2 Answers

My undergraduate project dealt with transcribing notes from a WAV file to a MIDI file. We handled only the simple case of one instrument, possibly playing more than one note at a time (a piano, for instance). Our research into the subject before we started showed that even this (i.e. only one instrument) is considered non-trivial. Basically, the problem is:

1. Find which frequencies are playing at any given time. This can be done by running a DFT/FFT over small windows, one at a time.
2. Use some heuristic to guess which frequencies are harmonics of the same note, and which belong to different notes. This may be easy if you know which instrument is playing, but it's hard in the general case, because the magnitude of each harmonic differs by instrument. For instance, you might have two Cs one octave apart from one instrument, or you might have one C but from a different instrument.
3. Once you know which notes are playing at each time, guess where the breaks between notes fall. You could have one long note or a series of short notes. Depending on the size of the windows you used for the initial DFT, you could get different results here.
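The first step above can be sketched roughly as follows. This is only an illustration, not the code from our project: it assumes NumPy, a mono signal already loaded as an array, and a made-up helper name `dominant_frequencies` with arbitrary window and hop sizes. It reports just the single strongest frequency per window, whereas a real transcriber would need every significant peak.

```python
import numpy as np

def dominant_frequencies(samples, sample_rate, window_size=4096, hop=2048):
    """Return the strongest frequency (Hz) in each window of a mono signal.

    Hypothetical helper for illustration; window_size and hop are arbitrary.
    """
    window = np.hanning(window_size)  # taper each frame to reduce spectral leakage
    freqs = np.fft.rfftfreq(window_size, d=1.0 / sample_rate)
    peaks = []
    for start in range(0, len(samples) - window_size + 1, hop):
        frame = samples[start:start + window_size] * window
        magnitudes = np.abs(np.fft.rfft(frame))
        peaks.append(freqs[np.argmax(magnitudes)])
    return np.array(peaks)

# Synthetic input: one second of A4 (440 Hz) followed by one second of A5 (880 Hz).
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
signal = np.concatenate([np.sin(2 * np.pi * 440 * t),
                         np.sin(2 * np.pi * 880 * t)])
peaks = dominant_frequencies(signal, sample_rate)
```

With a 4096-sample window at 44.1 kHz the frequency bins are about 10.8 Hz apart, so the reported peaks are only approximate. A larger window gives finer frequency resolution but blurs note boundaries in time, which is the trade-off mentioned in step 3.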

Now, if you have more than one instrument at a time, and no two are playing the same notes or harmonies thereof at one time, you might be able to tell the instruments apart using some heuristic on the magnitudes of the harmonies or on the sequences of notes they're playing. Most likely there will be times when two instruments are playing the same note. Then you don't really have any way to decide if there is (a) one instrument playing the note, (b) two instruments playing at the same volume, (c) one playing soft and the other playing loud or (d) any combination thereof.
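A small idealized example of that ambiguity, assuming NumPy and pure sine tones that are exactly in phase (real instruments add their own harmonics and phase differences, so this is a simplification): the three mixes below produce the same magnitude at 440 Hz, so the spectrum alone cannot tell cases (a), (b) and (c) apart. The helper name `magnitude_at` is made up for this sketch.

```python
import numpy as np

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate  # exactly one second of samples
note = np.sin(2 * np.pi * 440 * t)

one_instrument = 1.0 * note               # (a) a single source
two_equal = 0.5 * note + 0.5 * note       # (b) two sources at equal volume
soft_and_loud = 0.1 * note + 0.9 * note   # (c) one soft, one loud

def magnitude_at(signal, freq_hz):
    """Normalized spectral magnitude at an integer frequency.

    Bins are 1 Hz apart because the signal is exactly one second long.
    """
    spectrum = np.abs(np.fft.rfft(signal)) / (len(signal) / 2)
    return spectrum[int(freq_hz)]

magnitudes = [magnitude_at(s, 440) for s in (one_instrument, two_equal, soft_and_loud)]
```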

Anyway, that's the short list of problems to solve. I don't know of any algorithm that solves this in the general case. I don't think this problem has been solved yet.

Edit: My project presentation can be found at http://www-sipl.technion.ac.il/new/Archive/Special_Events/sipl2004/Projects_PowerPoint/WAV-to-MIDI.pdf

144

answered Oct 06 '22 00:10

Nathan Fellman

I have actually bumped into a very interesting algorithm called ICA (Independent Component Analysis). The concept behind this algorithm doesn't come from the signal processing world, but from probability theory. We used it to separate two songs that were mixed into a single mp3 file. There is an implementation called FastICA, available for Matlab, C++ and Python. Give it a shot; it's really nice.
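A minimal sketch of the idea, using the `FastICA` class from scikit-learn (an assumption on my part; it is one Python implementation, not necessarily the library we used). The two "songs" here are synthetic signals, and the mixing matrix is made up. Note that ICA in this form needs at least as many observed mixtures as sources, so the sketch assumes two channels (for example left and right) that each contain a different mix.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two synthetic, non-Gaussian sources standing in for two songs.
t = np.linspace(0, 8, 2000)
source_a = np.sin(2 * t)             # smooth tone
source_b = np.sign(np.sin(3 * t))    # square wave
sources = np.column_stack([source_a, source_b])

# Hypothetical mixing: each observed channel hears both sources at different levels.
mixing = np.array([[1.0, 0.5],
                   [0.5, 1.0]])
observed = sources @ mixing.T        # shape (n_samples, 2)

# Estimate the independent components from the mixtures alone.
ica = FastICA(n_components=2, random_state=0)
estimated = ica.fit_transform(observed)

def best_match(true_signal, candidates):
    """Highest absolute correlation between a true source and any estimate."""
    return max(abs(np.corrcoef(true_signal, candidates[:, i])[0, 1])
               for i in range(candidates.shape[1]))

match_a = best_match(source_a, estimated)
match_b = best_match(source_b, estimated)
```

ICA recovers the sources only up to order, sign and scale, which is why the comparison uses absolute correlation against each estimated component rather than a direct difference.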

answered Oct 05 '22 22:10

LiorH
