I want to do speech recognition in my Objective-C app using the iOS Speech framework.
I found some Swift examples but haven't been able to find anything in Objective-C.
Is it possible to access this framework from Objective-C? If so, how?
After spending quite some time looking for Objective-C samples (even in the Apple documentation), I couldn't find anything decent, so I figured it out myself.
/*!
* Import the Speech framework, adopt the delegate protocol and declare the instance variables
*/
#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h>
#import <Speech/Speech.h>
@interface ViewController : UIViewController <SFSpeechRecognizerDelegate> {
SFSpeechRecognizer *speechRecognizer;
SFSpeechAudioBufferRecognitionRequest *recognitionRequest;
SFSpeechRecognitionTask *recognitionTask;
AVAudioEngine *audioEngine;
}
@end
// The methods below belong in the implementation of ViewController
- (void)viewDidLoad {
[super viewDidLoad];
// Initialize the speech recognizer with a locale identifier such as en_US.
// [SFSpeechRecognizer supportedLocales] returns the locales the recognizer supports.
speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:[[NSLocale alloc] initWithLocaleIdentifier:@"en_US"]];
// Set speech recognizer delegate
speechRecognizer.delegate = self;
// Request authorization so the user is asked for permission to use speech recognition.
// Remember to add the usage description keys to the .plist file as well, see the repo's
// readme file or this project's Info.plist.
[SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
switch (status) {
case SFSpeechRecognizerAuthorizationStatusAuthorized:
NSLog(@"Authorized");
break;
case SFSpeechRecognizerAuthorizationStatusDenied:
NSLog(@"Denied");
break;
case SFSpeechRecognizerAuthorizationStatusNotDetermined:
NSLog(@"Not Determined");
break;
case SFSpeechRecognizerAuthorizationStatusRestricted:
NSLog(@"Restricted");
break;
default:
break;
}
}];
}
/*!
* @brief Starts listening and recognizing user input through the
* phone's microphone
*/
- (void)startListening {
// Initialize the AVAudioEngine
audioEngine = [[AVAudioEngine alloc] init];
// Make sure there's not a recognition task already running
if (recognitionTask) {
[recognitionTask cancel];
recognitionTask = nil;
}
// Starts an AVAudio Session
NSError *error;
AVAudioSession *audioSession = [AVAudioSession sharedInstance];
[audioSession setCategory:AVAudioSessionCategoryRecord error:&error];
[audioSession setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&error];
// Starts a recognition process, in the block it logs the input or stops the audio
// process if there's an error.
recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
AVAudioInputNode *inputNode = audioEngine.inputNode;
recognitionRequest.shouldReportPartialResults = YES;
recognitionTask = [speechRecognizer recognitionTaskWithRequest:recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error) {
BOOL isFinal = NO;
if (result) {
// Whatever you say into the microphone after pressing the button is logged
// to the console.
NSLog(@"RESULT:%@",result.bestTranscription.formattedString);
isFinal = result.isFinal;
}
// Tear everything down on error or once the final result has arrived
if (error || isFinal) {
[audioEngine stop];
[inputNode removeTapOnBus:0];
recognitionRequest = nil;
recognitionTask = nil;
}
}];
// Sets the recording format
AVAudioFormat *recordingFormat = [inputNode outputFormatForBus:0];
[inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
[recognitionRequest appendAudioPCMBuffer:buffer];
}];
// Starts the audio engine, i.e. it starts listening.
[audioEngine prepare];
if (![audioEngine startAndReturnError:&error]) {
NSLog(@"Could not start the audio engine: %@", error);
return;
}
NSLog(@"Say Something, I'm listening");
}
- (IBAction)microPhoneTapped:(id)sender {
if (audioEngine.isRunning) {
[audioEngine stop];
[recognitionRequest endAudio];
} else {
[self startListening];
}
}
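A side note on the locale passed to initWithLocale: in viewDidLoad: the identifier does not have to be guessed, because SFSpeechRecognizer can report which locales it supports. Here is a minimal sketch of that; the helper name LogSupportedSpeechLocales is my own and is not part of the sample project.

```objective-c
#import <Foundation/Foundation.h>
#import <Speech/Speech.h>

// Hypothetical helper (not in the original sample): logs the identifier of
// every locale the Speech framework reports as supported.
static void LogSupportedSpeechLocales(void) {
    for (NSLocale *locale in [SFSpeechRecognizer supportedLocales]) {
        NSLog(@"Supported locale: %@", locale.localeIdentifier);
    }
}
```

Calling LogSupportedSpeechLocales() once, for example at the top of viewDidLoad, prints the identifiers you can pass to initWithLocaleIdentifier:.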
Now add the SFSpeechRecognizerDelegate method to check whether the speech recognizer is available.
#pragma mark - SFSpeechRecognizerDelegate Delegate Methods
- (void)speechRecognizer:(SFSpeechRecognizer *)speechRecognizer availabilityDidChange:(BOOL)available {
NSLog(@"Availability:%d",available);
}
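If you want to do more than log the availability, one option is to enable or disable the record button from this callback. The sketch below assumes a microphoneButton outlet and a separate class name, neither of which exists in the sample; the update is pushed to the main queue because it touches UIKit.

```objective-c
#import <UIKit/UIKit.h>
#import <Speech/Speech.h>

// Hypothetical view controller used only for this sketch.
@interface AvailabilityViewController : UIViewController <SFSpeechRecognizerDelegate>
// Assumed outlet: the button wired to the microPhoneTapped: action.
@property (nonatomic, weak) IBOutlet UIButton *microphoneButton;
@end

@implementation AvailabilityViewController

- (void)speechRecognizer:(SFSpeechRecognizer *)speechRecognizer availabilityDidChange:(BOOL)available {
    // UI changes belong on the main queue, so hop there before touching the button.
    dispatch_async(dispatch_get_main_queue(), ^{
        self.microphoneButton.enabled = available;
    });
}

@end
```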
Remember to modify the .plist file to get the user's authorization for speech recognition and for using the microphone. The <string> values should of course be customized to your needs. You can do this by creating and modifying the values in the Property List editor, or by right-clicking the .plist file, choosing Open As -> Source Code and pasting the following lines before the closing </dict> tag.
<key>NSMicrophoneUsageDescription</key> <string>This app uses your microphone to record what you say, so watch what you say!</string>
<key>NSSpeechRecognitionUsageDescription</key> <string>This app uses speech recognition to transform your spoken words into text and then analyze them, so watch what you say!</string>
Also remember that the Speech framework is only available on iOS 10.0 and later.
To get this running and test it you only need a very basic UI: create a UIButton and assign the microPhoneTapped action to it. When the button is pressed, the app should start listening and log everything it hears through the microphone to the console (in the sample code NSLog is the only thing receiving the text). Pressing the button again should stop the recording.
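If you would rather see the text on screen than in the console, a small helper along these lines could be called from the result handler. This is only a sketch: the function name ShowTranscription and the label you pass in are assumptions of mine, not part of the sample project.

```objective-c
#import <UIKit/UIKit.h>
#import <Speech/Speech.h>

// Hypothetical helper (not in the original sample): copies the best
// transcription of a recognition result into a label on the main queue.
static void ShowTranscription(SFSpeechRecognitionResult *result, UILabel *label) {
    NSString *text = result.bestTranscription.formattedString;
    dispatch_async(dispatch_get_main_queue(), ^{
        label.text = text;
    });
}
```

Inside the `if (result)` branch of the result handler you would then call something like ShowTranscription(result, self.transcriptLabel), where transcriptLabel is an outlet you add yourself.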
I created a GitHub repo with a sample project. Enjoy!