I've found lots of examples online for working with audio in iOS, but most of them are pretty outdated and don't apply to what I'm trying to accomplish. Here's my project:
I need to capture audio samples from two sources - microphone input and stored audio files. I need to perform FFT on these samples to produce a "fingerprint" for the entire clip, as well as apply some additional filters. The ultimate goal is to build a sort of song-recognition software similar to Shazam, etc.
What is the best way to capture the individual audio samples in iOS 8 for performing a Fast Fourier Transform? I imagine ending up with a large array of them, but I suspect that it might not work quite like that. Secondly, how can I use the Accelerate framework for processing the audio? It seems to be the most efficient way to perform complex analysis on audio in iOS.
All the examples I've seen online are using older versions of iOS and Objective-C, and I haven't been able to successfully translate them into Swift. Does iOS 8 provide some new frameworks for this sort of thing?
AVAudioEngine is the way to go for this. From Apple's docs:
- For playback and recording of a single track, use AVAudioPlayer and AVAudioRecorder.
- For more complex audio processing, use AVAudioEngine. AVAudioEngine includes AVAudioInputNode and AVAudioOutputNode for audio input and output. You can also use AVAudioNode objects for processing and mixing effects into your audio.
I'll be straight with you: AVAudioEngine is an extremely finicky API with vague documentation, rarely-helpful error messaging, and almost no online code examples demonstrating more than the most basic tasks. BUT if you take the time to get over the small learning curve, you can really do some magical things with it relatively easily.
I've built a simple "playground" view controller that demonstrates both microphone and audio file sampling working in tandem:
import UIKit
import AVFoundation

class AudioEnginePlaygroundViewController: UIViewController {

    private var audioEngine: AVAudioEngine!
    private var mic: AVAudioInputNode!
    private var micTapped = false

    override func viewDidLoad() {
        super.viewDidLoad()
        configureAudioSession()
        audioEngine = AVAudioEngine()
        // inputNode is optional in the SDK this was written against (Swift 3);
        // newer SDKs return a non-optional, so drop the `!` there.
        mic = audioEngine.inputNode!
    }

    static func getController() -> AudioEnginePlaygroundViewController {
        let me = AudioEnginePlaygroundViewController(nibName: "AudioEnginePlaygroundViewController", bundle: nil)
        return me
    }

    @IBAction func toggleMicTap(_ sender: Any) {
        // A second press removes the tap that the first press installed.
        if micTapped {
            mic.removeTap(onBus: 0)
            micTapped = false
            return
        }

        let micFormat = mic.inputFormat(forBus: 0)
        mic.installTap(onBus: 0, bufferSize: 2048, format: micFormat) { (buffer, when) in
            // Channel 0 of the buffer as Float samples; process them here.
            let sampleData = UnsafeBufferPointer(start: buffer.floatChannelData![0], count: Int(buffer.frameLength))
        }
        micTapped = true
        startEngine()
    }

    @IBAction func playAudioFile(_ sender: Any) {
        stopAudioPlayback()
        let playerNode = AVAudioPlayerNode()

        let audioUrl = Bundle.main.url(forResource: "test_audio", withExtension: "wav")!
        let audioFile = readableAudioFileFrom(url: audioUrl)
        audioEngine.attach(playerNode)
        audioEngine.connect(playerNode, to: audioEngine.outputNode, format: audioFile.processingFormat)
        startEngine()

        // Remove the tap once the scheduled file has been played through.
        playerNode.scheduleFile(audioFile, at: nil) {
            playerNode.removeTap(onBus: 0)
        }

        playerNode.installTap(onBus: 0, bufferSize: 4096, format: playerNode.outputFormat(forBus: 0)) { (buffer, when) in
            // Channel 0 of the buffer as Float samples; process them here.
            let sampleData = UnsafeBufferPointer(start: buffer.floatChannelData![0], count: Int(buffer.frameLength))
        }
        playerNode.play()
    }

    // MARK: Internal Methods

    private func configureAudioSession() {
        do {
            // Swift 3 spelling. With the Swift 4.2+ SDK this call is
            // setCategory(.playAndRecord, mode: .default, options: [.mixWithOthers, .defaultToSpeaker]).
            try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayAndRecord, with: [.mixWithOthers, .defaultToSpeaker])
            try AVAudioSession.sharedInstance().setActive(true)
        } catch {
            print("Audio session setup failed: \(error)")
        }
    }

    private func readableAudioFileFrom(url: URL) -> AVAudioFile {
        // Implicitly unwrapped: returning it crashes if the file could not be opened.
        var audioFile: AVAudioFile!
        do {
            try audioFile = AVAudioFile(forReading: url)
        } catch {
            print("Could not open audio file: \(error)")
        }
        return audioFile
    }

    private func startEngine() {
        guard !audioEngine.isRunning else {
            return
        }

        do {
            try audioEngine.start()
        } catch {
            print("Could not start audio engine: \(error)")
        }
    }

    private func stopAudioPlayback() {
        audioEngine.stop()
        audioEngine.reset()
    }
}
The audio samples are delivered through the block you pass to installTap, which is called repeatedly as audio passes through the tapped node (either the mic or the audio file player) in real time. You can access individual samples by indexing the sampleData buffer pointer that I've created in each block.
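To get from those tap callbacks to something an FFT can consume, you generally need to copy the samples out of each buffer and regroup them into fixed-size frames. Here is a rough sketch of that idea. FrameAccumulator and magnitudeSpectrum are names I made up for illustration, not part of AVFoundation or Accelerate, and the transform is a deliberately naive O(n²) DFT so the example stays self-contained. In a real app you would most likely swap it for the FFT routines in Accelerate's vDSP.

```swift
import Foundation

/// Hypothetical helper: collects samples from successive tap callbacks and
/// returns them in fixed-size frames, since an FFT expects a fixed length.
struct FrameAccumulator {
    let frameSize: Int
    private var pending: [Float] = []

    init(frameSize: Int) {
        self.frameSize = frameSize
    }

    /// Copies the samples out of the tap's buffer (assumption: the memory
    /// behind the pointer should not be relied on after the callback returns)
    /// and returns every complete frame that is now available.
    mutating func append(_ samples: UnsafeBufferPointer<Float>) -> [[Float]] {
        pending.append(contentsOf: samples)
        var frames: [[Float]] = []
        while pending.count >= frameSize {
            frames.append(Array(pending.prefix(frameSize)))
            pending.removeFirst(frameSize)
        }
        return frames
    }
}

/// Hypothetical helper: Hann-windowed magnitude spectrum for bins 0..<n/2.
/// Uses a naive O(n^2) DFT purely for illustration; it is not Accelerate.
func magnitudeSpectrum(of frame: [Float]) -> [Float] {
    let n = frame.count
    guard n > 1 else { return [] }

    // The Hann window tapers the frame edges to reduce spectral leakage.
    var windowed = [Double](repeating: 0, count: n)
    for i in 0..<n {
        let w = 0.5 - 0.5 * cos(2.0 * Double.pi * Double(i) / Double(n - 1))
        windowed[i] = Double(frame[i]) * w
    }

    var magnitudes = [Float](repeating: 0, count: n / 2)
    for k in 0..<(n / 2) {
        var re = 0.0
        var im = 0.0
        for i in 0..<n {
            let angle = -2.0 * Double.pi * Double(k) * Double(i) / Double(n)
            re += windowed[i] * cos(angle)
            im += windowed[i] * sin(angle)
        }
        magnitudes[k] = Float((re * re + im * im).squareRoot())
    }
    return magnitudes
}
```

Inside either tap block you would call something like `accumulator.append(sampleData)` and run `magnitudeSpectrum(of:)` on each frame it returns. Keep in mind that the tap block may be invoked on a thread other than the main thread, so the accumulator should only be touched from one place, or be protected by a serial queue.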