 

Analyzing audio to create Guitar Hero levels automatically

I'm trying to create a Guitar-Hero-like game (something like this) and I want to be able to analyze an audio file given by the user and create levels automatically, but I am not sure how to do that.

I thought maybe I should use a BPM detection algorithm, place an arrow on each beat, and a rail on some recurrent pattern, but I have no idea how to implement those.

Also, I'm using NAudio's BlockAlignReductionStream, which has a Read method that copies byte[] data, but what happens when I read a two-channel audio file? Does it read one byte from the first channel and then one byte from the second (since it says 16-bit PCM)? And does the same happen with 24-bit and 32-bit float audio?
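(For reference on the byte layout: 16-bit stereo PCM is interleaved per sample, not per byte — each frame is a 2-byte little-endian left sample followed by a 2-byte right sample, and 24-bit and 32-bit float data are interleaved the same way, just with wider samples. A minimal Python sketch of de-interleaving such a buffer; `split_stereo_16bit` is a made-up helper name, not an NAudio API:)

```python
import struct

def split_stereo_16bit(buffer: bytes):
    """Split an interleaved 16-bit stereo PCM byte buffer into two channels.

    Layout per frame: 2 bytes left sample (little-endian signed short),
    then 2 bytes right sample -- whole samples alternate, not single bytes.
    """
    samples = struct.unpack("<%dh" % (len(buffer) // 2), buffer)
    left = samples[0::2]   # every even-indexed sample is left channel
    right = samples[1::2]  # every odd-indexed sample is right channel
    return left, right

# One frame: left sample = 1000, right sample = -1000
buf = struct.pack("<hh", 1000, -1000)
l, r = split_stereo_16bit(buf)
# l == (1000,), r == (-1000,)
```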

asked Nov 20 '11 by Symbol

1 Answer

Beat detection (or more specifically BPM detection)

An overview of a beat detection algorithm using a comb filter:

  • http://www.clear.rice.edu/elec301/Projects01/beat_sync/beatalgo.html

Looks like they do:

  • A fast Fourier transform
  • Hanning window, full-wave rectification
  • Multiple low pass filters; one for each range of the FFT output
  • Differentiation and half-wave rectification
  • Comb filter

Lots of algorithms you'll have to implement here. Comb filters are supposedly slow, though. The wiki article didn't point me at other specific methods.
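To make that final comb-filter stage concrete, here's a deliberately simplified pure-Python sketch. It assumes you've already produced an onset-energy envelope (the output of the filtering/rectification steps above); `detect_bpm` and the frame rate are illustrative names, not part of any library.

```python
def detect_bpm(envelope, frames_per_sec, bpm_range=(60, 180)):
    """Pick the BPM whose impulse train (comb) best lines up with an
    onset-energy envelope -- a toy version of the comb-filter stage."""
    best_bpm, best_score = None, -1.0
    for bpm in range(bpm_range[0], bpm_range[1] + 1):
        period = frames_per_sec * 60.0 / bpm  # frames between beats
        score = 0.0
        for phase in range(int(period)):      # try every beat alignment
            t, energy = float(phase), 0.0
            while t < len(envelope):          # sum envelope at comb teeth
                energy += envelope[int(t)]
                t += period
            score = max(score, energy)
        if score > best_score:
            best_bpm, best_score = bpm, score
    return best_bpm

# Synthetic envelope: a spike every 0.5 s at 100 frames/s -> 120 BPM.
envelope = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
print(detect_bpm(envelope, 100))  # -> 120
```

This is O(candidates × envelope length), which hints at why comb filters get called slow on real audio.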

Edit: This article describes streaming, statistical methods of beat detection, which sounds like a great idea: http://www.flipcode.com/misc/BeatDetectionAlgorithms.pdf - I'm betting they run better in real time, though they're less accurate.
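The statistical approach is much simpler: flag a beat whenever a block's instant energy jumps well above the recent average. A minimal sketch in that spirit; the block size, ~43-block history (about one second at 44.1 kHz), and the constant C are ballpark values, and `detect_beats` is a made-up name:

```python
def detect_beats(samples, block=1024, history=43, c=1.4):
    """Streaming sound-energy beat detector: a block is a beat when its
    energy exceeds c times the average energy of the last ~1 s of blocks."""
    beats, energies = [], []
    for i in range(0, len(samples) - block + 1, block):
        e = sum(s * s for s in samples[i:i + block])  # block energy
        if len(energies) == history:
            avg = sum(energies) / history
            if e > c * avg:
                beats.append(i)  # beat at this sample offset
            energies.pop(0)      # slide the history window
        energies.append(e)
    return beats

# 50 quiet blocks, one loud block, 9 quiet blocks -> one detected beat.
song = [0.1] * 1024 * 50 + [1.0] * 1024 + [0.1] * 1024 * 9
print(detect_beats(song))  # -> [51200]
```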

BTW I just skimmed and pulled out keywords. I've only toyed with FFT, rectification, and attenuation filters (low-pass filter). The rest I have no clue about, but you've got links.

This will all get you the BPM of the song, but it won't generate your arrows for you.

Level generation

As for "place an arrow on a beat and a rail on some recurrent pattern", that is going to be trickier to implement with good results.
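The mechanical part is easy, though: once you have a BPM and the time of the first beat, you can quantize arrows to the beat grid. A minimal sketch, where the random lane choice is purely a placeholder (a real generator would pick lanes from pitch or intensity features):

```python
import random

def generate_chart(bpm, first_beat_sec, song_length_sec, lanes=4, seed=0):
    """Place one arrow per beat on a fixed BPM grid.

    Lane selection is random here only as a stand-in for real
    feature-driven lane assignment.
    """
    rng = random.Random(seed)          # deterministic for a given seed
    beat_interval = 60.0 / bpm         # seconds between beats
    chart, t = [], first_beat_sec
    while t < song_length_sec:
        chart.append((round(t, 3), rng.randrange(lanes)))
        t += beat_interval
    return chart

chart = generate_chart(bpm=120, first_beat_sec=0.5, song_length_sec=3.0)
# 120 BPM -> one arrow every 0.5 s: times 0.5, 1.0, 1.5, 2.0, 2.5
```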

You could go with a more aggressive content extraction approach, and try to pull the notes out of the song.

You'd need to use beat detection for this part too. It may be similar to the BPM detection above, but run at a different range, with a band-pass filter tuned to the instrument's range. You'd also swap out or remove some parts of the algorithm, and you'd have to sample the whole song, since you're not just detecting a global BPM. You'd also need some sort of pitch detection.
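For the band-pass step, a standard choice is a biquad with coefficients from the RBJ Audio EQ Cookbook (constant 0 dB peak-gain band-pass). A pure-Python sketch; the function name and Q value are illustrative:

```python
import math

def bandpass(samples, sample_rate, center_hz, q=1.0):
    """Biquad band-pass filter (RBJ cookbook, constant 0 dB peak gain):
    isolate an instrument's frequency band before running onset detection
    on just that band."""
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    b0, b1, b2 = alpha, 0.0, -alpha          # feed-forward coefficients
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha  # feedback
    out = []
    x1 = x2 = y1 = y2 = 0.0                  # filter state (delay line)
    for x in samples:
        y = (b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1, y2, y1 = x1, x, y1, y
        out.append(y)
    return out
```

Feeding it a sine at the center frequency passes nearly unchanged, while a tone a decade away comes out heavily attenuated — which is what lets per-band onset detection see one instrument at a time.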

I think this approach will be messy and will guarantee you need to hand-scrub the results for every song. If you're okay with this, and just want to avoid the initial hand transcription work, this will probably work well.

You could also try to go with a content generation approach.

Most procedural content generation has been done in a trial-and-error manner, with people publishing or patenting algorithms that don't completely suck. Often there is no real qualitative analysis that can be done on content generation algorithms because they generate aesthetics. So you'd just have to pick ones that seem to give pleasing sample results and try it out.

Most algorithms are centered around visual content generation, including terrain, architecture, humanoids, plants, etc. There is some research on audio content generation, Generative Music, etc., but your requirements don't perfectly match either of these.

I think algorithms for procedural "dance steps" (if such a thing exists - I only found animation techniques) or Generative Music would be the closest match, if driven by the rhythms you detect in the song.

If you want to go down the composition generation approach, be prepared for a lot of completely different algorithms that are usually just hinted about, but not explained in detail.

E.g.:

  • http://tones.wolfram.com/about/faqs/howitworks.html
  • http://research.microsoft.com/en-us/um/redmond/projects/songsmith/
answered Sep 21 '22 by Merlyn Morgan-Graham
