I'm new to this field - but I need to perform a WAV-to-MIDI conversion in java. Is there a way to know what exactly are the steps involved in WAV-to-MIDI conversion? I have a very rough idea as in you need to; sample the wav file, filter it, use FFT for spectral analysis, feature extraction and then write the extracted features on to MIDI. But I cannot find solid sources or papers as in how to do all that? Can some one give me clues as in how and where to start? Are there any Open Source APIs available for this WAV-to-MIDI conversion process?
Advance thanks
Bear File Converter (Ofoct) The first one we will introduce is the Bear File Converter, one of the best free online tools. It is very simple to use this online tool to convert WAV files to MIDI. In addition to WAV, you can also convert various other audio formats such as MP3, OGG, AAC, WMA to MIDI.
Introducing Basic Pitch, Spotify's free open source tool for converting audio into MIDI. Basic Pitch uses machine learning to transcribe the musical notes in a recording. Drop a recording of almost any instrument, including your voice, then get back a MIDI version, just like that.
A WAV file is an uncompressed file format. The creators of this configuration were Microsoft and IBM. A MIDI file is made up strictly of audio sounds that are a remix for a digital format. This type of music file is smaller than other formats because it does not actually hold any music.
This is a field which is still highly under development, yet, there are some (experimental) algorithms available.
You can install sonic annotator and use a few vamp plugins.
For example:
./sonic-annotator file.wav -d vamp:qm-vamp-plugins:qm-transcription:transcription -w midi
./sonic-annotator file.wav -d vamp:silvet:silvet:notes -w midi
./sonic-annotator file.wav -d vamp:ua-vamp-plugins:mf0ua:mf0ua -w midi
It's a more involved process than you might imagine.
This research problem is often referred to as music transcription: the act of converting a low-level representation of music (e.g., waveform) into a higher-level representation such as MIDI or even sheet music.
The sophistication of your solution will depend upon the complexity of your input data. Tons of research papers address music transcription only on monophonic piano or drums... because they are easy to transcribe. (Relatively.) Violin is harder. Voice is even harder. Violin plus voice plus piano is much harder. A symphony is nearly impossible. You get the picture.
The basic elements of music transcription involve any of the following overlapping areas:
Search for papers on "music transcription" on Google Scholar or from the ISMIR proceedings: http://www.ismir.net. If you are more interested in one of the above subtopics, I can point you further. Good luck.
EDIT: That being said, there are existing solutions that we can all find on the web. Feel free to try them. But as you do, evaluate them with a critical eye and ear. What types of audio signals would cause transcription to fail?
EDIT 2: Ah, you are only doing this for piano. Okay, this is doable. Music transcription has advanced to the point where it can transcribe monophonic piano pretty well. A Rachmaninov concerto will still pose problems.
Our recommendations depend upon your end goal. You state "need to perform... in Java." So it sounds like you just want something to work regardless of how it gets you there. In that case, I agree 100% with others: use something that exists.
That's actually an interesting question; all of the MIR libraries I know are typically C/C++/Python/Matlab. But not Java. The EchoNest has a Java API, but I don't think it does note-level transcription. http://developer.echonest.com. (Edit: It does note-level transcription. The returned data includes pitch, timbre, beat, tatum, and more. But I find polyphony is still a problem.)
Oh, Marsyas is Java-based. Cool. I thought it was just C++. http://marsyas.info/ I recommend this. It's developed by George Tzanetakis, a professor in MIR. It does signal-level analysis and should be a good option.
Now, if this is for a fun learning experience, I think you can use the sound manipulation utilities in Java to experiment with the WAV signal and see what comes out.
EDIT: This page describes MIR software better than I can: The Tools We Use
For Matlab, you may be interested in the MIR Toolbox
Here is a nice page of common datasets: MIR Datasets
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With