How to capture audio samples in iOS with Swift?

Question

I've found lots of examples online for working with audio in iOS, but most of them are pretty outdated and don't apply to what I'm trying to accomplish. Here's my project:

I need to capture audio samples from two sources - microphone input and stored audio files. I need to perform FFT on these samples to produce a "fingerprint" for the entire clip, as well as apply some additional filters. The ultimate goal is to build a sort of song-recognition software similar to Shazam, etc.

What is the best way to capture the individual audio samples in iOS 8 for performing a Fast Fourier Transform? I imagine ending up with a large array of them, but I suspect that it might not work quite like that. Secondly, how can I use the Accelerate framework for processing the audio? It seems to be the most efficient way to perform complex analysis on audio in iOS.

All the examples I've seen online are using older versions of iOS and Objective-C, and I haven't been able to successfully translate them into Swift. Does iOS 8 provide some new frameworks for this sort of thing?

WongWray · Accepted Answer

AVAudioEngine is the way to go for this. From Apple's docs:

For playback and recording of a single track, use AVAudioPlayer and AVAudioRecorder.

For more complex audio processing, use AVAudioEngine. AVAudioEngine includes AVAudioInputNode and AVAudioOutputNode for audio input and output. You can also use AVAudioNode objects for processing and mixing effects into your audio

I'll be straight with you: AVAudioEngine is an extremely finicky API with vague documentation, rarely-helpful error messaging, and almost no online code examples demonstrating more than the most basic tasks. BUT if you take the time to get over the small learning curve, you can really do some magical things with it relatively easily.

I've built a simple "playground" view controller that demonstrates both microphone and audio file sampling working in tandem:

import UIKit

class AudioEnginePlaygroundViewController: UIViewController {
    private var audioEngine: AVAudioEngine!
    private var mic: AVAudioInputNode!
    private var micTapped = false
    override func viewDidLoad() {
        super.viewDidLoad()
        configureAudioSession()
        audioEngine = AVAudioEngine()
        mic = audioEngine.inputNode!
    }

    static func getController() -> AudioEnginePlaygroundViewController {
        let me = AudioEnginePlaygroundViewController(nibName: "AudioEnginePlaygroundViewController", bundle: nil)
        return me
    }

    @IBAction func toggleMicTap(_ sender: Any) {
        if micTapped {
            mic.removeTap(onBus: 0)
            micTapped = false
            return
        }

        let micFormat = mic.inputFormat(forBus: 0)
        mic.installTap(onBus: 0, bufferSize: 2048, format: micFormat) { (buffer, when) in
            let sampleData = UnsafeBufferPointer(start: buffer.floatChannelData![0], count: Int(buffer.frameLength))
        }
        micTapped = true
        startEngine()
    }

    @IBAction func playAudioFile(_ sender: Any) {
        stopAudioPlayback()
        let playerNode = AVAudioPlayerNode()

        let audioUrl = Bundle.main.url(forResource: "test_audio", withExtension: "wav")!
        let audioFile = readableAudioFileFrom(url: audioUrl)
        audioEngine.attach(playerNode)
        audioEngine.connect(playerNode, to: audioEngine.outputNode, format: audioFile.processingFormat)
        startEngine()

        playerNode.scheduleFile(audioFile, at: nil) {
            playerNode .removeTap(onBus: 0)
        }
        playerNode.installTap(onBus: 0, bufferSize: 4096, format: playerNode.outputFormat(forBus: 0)) { (buffer, when) in
            let sampleData = UnsafeBufferPointer(start: buffer.floatChannelData![0], count: Int(buffer.frameLength))
        }
        playerNode.play()
    }

    // MARK: Internal Methods

    private func configureAudioSession() {
        do {
            try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayAndRecord, with: [.mixWithOthers, .defaultToSpeaker])
            try AVAudioSession.sharedInstance().setActive(true)
        } catch { }
    }

    private func readableAudioFileFrom(url: URL) -> AVAudioFile {
        var audioFile: AVAudioFile!
        do {
            try audioFile = AVAudioFile(forReading: url)
        } catch { }
        return audioFile
    }

    private func startEngine() {
        guard !audioEngine.isRunning else {
            return
        }

        do {
            try audioEngine.start()
        } catch { }
    }

    private func stopAudioPlayback() {
        audioEngine.stop()
        audioEngine.reset()
    }
}

The audio samples are given to you via installTap's completion handler which is continuously called as audio passes through the tapped node (either the mic or the audio file player) in real time. You can access individual samples by indexing the sampleData pointer that I've created in each block.

How to capture audio samples in iOS with Swift?

Tags:

ios

swift

signal-processing

audio

fft

Hundley

1 Answers

WongWray

Recent Activity

Donate For Us

How to capture audio samples in iOS with Swift?

Tags:

ios

swift

signal-processing

audio

fft

Hundley

1 Answers

WongWray

Related questions

Recent Activity

Donate For Us