 

Mixing multiple signals using Audio Units on iOS

I'm making a controller for a synth on the iPad, and I have 4 or 5 signals that I want to mix and send to the remoteIO render callback. Right now I have two choices:

I can use the multichannel mixer unit, but I don't know how it works internally. Does it simply add the buffers together and divide by the number of inputs? If so, the volume of each signal would be greatly diminished.

I read http://www.vttoth.com/digimix.htm and http://atastypixel.com/blog/how-to-mix-audio-samples-properly-on-ios/ on the proper way to mix signals and am now thinking of doing the mixing manually in the remoteIO callback.

Any suggestions?

asked Dec 27 '22 by anon

1 Answer

As I've recently been through this very issue, and had been meaning to anyway, I've written a detailed post about three of the options you have when manually mixing/summing audio in your render callback:

Mixing Audio without Clipping in iOS: Limiters and Other Techniques

Basically, the first method is outlined in the blog posts you've referenced. However, as stated in A Tasty Pixel's post, this "averaging" technique introduces harmonic distortion. Apparently, though, it's not noticeable in certain applications, such as when the sources are full tracks or noise-based percussion (e.g. snare drums). According to A Tasty Pixel, the Loopy app, a professional-calibre audio tool, uses this technique successfully.
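For reference, here is a minimal sketch of the mixing formula from the first link (vttoth.com) for a pair of samples; the bias into unsigned range and the clamp are my own adaptation to signed 16-bit PCM, not code from either post:

```c
#include <stdint.h>

/* Sketch of the Viktor Toth mixing formula for two samples, adapted
 * to signed 16-bit PCM by biasing into the unsigned [0, 65536) range.
 * Two quiet samples are multiplied (result stays quiet); otherwise
 * they are summed and pulled back down so the result never wraps. */
static int16_t mix_two_samples(int16_t sa, int16_t sb)
{
    uint32_t a = (uint32_t)(sa + 32768);   /* bias to unsigned */
    uint32_t b = (uint32_t)(sb + 32768);
    uint32_t m;

    if (a < 32768 && b < 32768)
        m = (a * b) / 32768;               /* both below the midpoint */
    else
        m = 2 * (a + b) - (a * b) / 32768 - 65536;

    if (m > 65535) m = 65535;              /* guard the full-scale corner */
    return (int16_t)((int32_t)m - 32768);  /* remove the bias */
}
```

The product term is the nonlinearity, which is where the harmonic distortion mentioned above comes from; mixing more than two sources means chaining this pairwise.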

The second option, which is perhaps better if you have natural sounds and/or high polyphony, is simply to scale down the volume of the audio by pre-multiplying the audio data by a constant 0 < A < 1. It might sound facetious, but I believe, for reasons outlined in my blog post, that OpenAL does something similar to this based on how many sources you allocate. The scale factor needn't be as low as you might think. My app, Sound Wand, has samples normalised to full scale and a max polyphony of 20, yet I use a pre-scale value of only about 1/3 (not 1/20). The upside is a very nice dynamic range in your instrument - i.e. soft notes are quiet, and hard ones, or lots of them together, are much louder. This is often considered one of the hallmarks of a quality instrument. The downside is that it can be a bit quiet at times on the iPhone/iPad's built-in speaker, and the dynamic range can be too much for cheap external amplifiers.
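Inside a render callback, this option is little more than a scaled sum. A minimal sketch, with illustrative names (kNumVoices, kPreScale and sum_voices are placeholders, not from any real API):

```c
/* Sketch of option 2: sum the active voices, then apply a fixed
 * pre-scale. All names here are illustrative. */
#define kNumVoices 20
static const float kPreScale = 1.0f / 3.0f;   /* ~1/3 as above, not 1/20 */

static void sum_voices(float *out, float *const voices[kNumVoices],
                       unsigned frameCount)
{
    for (unsigned i = 0; i < frameCount; i++) {
        float acc = 0.0f;
        for (unsigned v = 0; v < kNumVoices; v++)
            acc += voices[v][i];
        /* Soft notes stay quiet; many simultaneous notes get louder,
         * at the cost of possible clipping on rare worst-case chords. */
        out[i] = acc * kPreScale;
    }
}
```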

The third option is indeed the brickwall limiter but there is nothing simple about it. A regular brickwall limiter will NOT prevent clipping. You need a lookahead brickwall limiter and code for these is not readily available. The reason you need the lookahead is that the limiter needs time to smoothly decrease the volume up until the point that clipping first occurs. It is not enough to "reduce volume immediately" when the waveform begins to clip because reducing the volume to bring it back to 1.0 is the exact same thing that clipping does for you! As a result, non-lookahead limiters will only remove half the clip (and that only because of the release time).

(Figure: brickwall limiting without "lookahead" kicks in too late and only prevents half the clip)

There are drawbacks with lookahead limiters as well. They introduce their own harmonic distortion, though much less than clipping does. Worse, they increase the latency (the time between a user's action and the audio outcome) by the lookahead time, which means a less responsive app. As you might have guessed, a longer lookahead means a more transparent outcome (less distortion), so there is a tradeoff. I still believe this to be a viable method, and I outline it a bit at the bottom of the post.
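To make the lookahead idea concrete, here is a very rough sketch of one possible shape for such a limiter. This is not the code from the linked post: a real implementation would smooth the attack over the lookahead window and replace the per-sample peak scan with a sliding-window maximum; both are simplified here for clarity.

```c
#include <math.h>

/* Rough sketch of a lookahead brickwall limiter. The output is the
 * input delayed by LOOKAHEAD samples, and the gain for each output
 * sample is derived from the peak of that sample plus everything
 * still queued behind it - so the limiter reacts before the clip
 * reaches the output, which a non-lookahead limiter cannot do. */
#define LOOKAHEAD 256                /* ~5.8 ms at 44.1 kHz: added latency */

typedef struct {
    float delay[LOOKAHEAD];          /* circular delay line, zero-initialised */
    unsigned pos;
    float gain;                      /* smoothed gain, starts at 1.0f */
    float release;                   /* e.g. 0.0005f: slow per-sample recovery */
    float threshold;                 /* e.g. 0.98f, just under full scale */
} Limiter;

static float limiter_process(Limiter *lim, float in)
{
    float out = lim->delay[lim->pos];          /* oldest sample leaves */
    lim->delay[lim->pos] = in;                 /* newest sample enters */
    lim->pos = (lim->pos + 1) % LOOKAHEAD;

    /* Peak over the outgoing sample and its "future" in the delay line
     * (an O(LOOKAHEAD) scan per sample, for clarity only). */
    float peak = fabsf(out);
    for (unsigned i = 0; i < LOOKAHEAD; i++) {
        float a = fabsf(lim->delay[i]);
        if (a > peak) peak = a;
    }

    float target = (peak > lim->threshold) ? lim->threshold / peak : 1.0f;
    if (target < lim->gain)
        lim->gain = target;                               /* clamp down now */
    else
        lim->gain += lim->release * (target - lim->gain); /* ease back up */

    return out * lim->gain;
}
```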

One final note with regard to using the AU mixer versus manually mixing in remoteIO: I'd generally be more inclined to use the AU if my channels are like audio "tracks" and I want the usual volume/pan to be controlled by the user, or if you have a game with a few channels for background/foreground/etc. If you have a lot of sounds, like notes on a keyboard, or if you want bespoke control over the summing (e.g. different pan laws, special rules about which sounds are on/off, etc.), you might be better off doing it manually. If you are just summing a couple of audio tracks, then really either option will do. If you go with option 2 above, doing it manually takes very few lines of code once you've gotten your hands into the Accelerate framework's vDSP functions.
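For example, here is a sketch of the pre-scale-and-sum from option 2 done with vDSP (variable names are mine; vDSP_vclr zeroes a vector, and vDSP_vsma computes D = A * scalar + C element-wise, which I believe is safe to run in place as done here):

```c
#include <Accelerate/Accelerate.h>

/* Sketch of option 2 using vDSP: zero the output, then
 * multiply-accumulate each voice into it with one call per voice. */
static void mix_with_vdsp(float *out, float *const *voices,
                          unsigned voiceCount, unsigned frameCount)
{
    const float preScale = 1.0f / 3.0f;

    vDSP_vclr(out, 1, frameCount);
    for (unsigned v = 0; v < voiceCount; v++)
        vDSP_vsma(voices[v], 1, &preScale,   /* out += voice * preScale */
                  out, 1, out, 1, frameCount);
}
```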

answered Feb 15 '23 by Hari Honor