How to implement speech-to-text via the Speech framework in Objective-C?

1 Answer

After spending a good deal of time looking for Objective-C samples, even in the Apple documentation, I couldn't find anything decent, so I figured it out myself.

Header file (.h)

/*!
 * Import the Speech framework, adopt the delegate protocol and declare the instance variables
 */

#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h> // AVAudioEngine and AVAudioSession
#import <Speech/Speech.h>

@interface ViewController : UIViewController <SFSpeechRecognizerDelegate> {
    SFSpeechRecognizer *speechRecognizer;
    SFSpeechAudioBufferRecognitionRequest *recognitionRequest;
    SFSpeechRecognitionTask *recognitionTask;
    AVAudioEngine *audioEngine;
}

@end

Methods file (.m)

- (void)viewDidLoad {
    [super viewDidLoad];

    // Initialize the speech recognizer with a locale. Locale identifiers use the
    // language_REGION form (e.g. en_US); +[SFSpeechRecognizer supportedLocales]
    // returns the set of locales the recognizer supports.
    speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:[[NSLocale alloc] initWithLocaleIdentifier:@"en_US"]];

    // Set speech recognizer delegate
    speechRecognizer.delegate = self;

    // Request authorization so the user is asked for permission and you can get an
    // authorized response. Also remember to add the usage descriptions to the .plist
    // file; check the repo's readme file or this project's Info.plist.
    [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
        switch (status) {
            case SFSpeechRecognizerAuthorizationStatusAuthorized:
                NSLog(@"Authorized");
                break;
            case SFSpeechRecognizerAuthorizationStatusDenied:
                NSLog(@"Denied");
                break;
            case SFSpeechRecognizerAuthorizationStatusNotDetermined:
                NSLog(@"Not Determined");
                break;
            case SFSpeechRecognizerAuthorizationStatusRestricted:
                NSLog(@"Restricted");
                break;
            default:
                break;
        }
    }];

}

/*!
 * @brief Starts listening and recognizing user input through the 
 * phone's microphone
 */

- (void)startListening {

    // Initialize the AVAudioEngine
    audioEngine = [[AVAudioEngine alloc] init];

    // Make sure there's not a recognition task already running
    if (recognitionTask) {
        [recognitionTask cancel];
        recognitionTask = nil;
    }

    // Starts an AVAudio Session
    NSError *error;
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setCategory:AVAudioSessionCategoryRecord error:&error];
    [audioSession setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&error];

    // Starts a recognition process, in the block it logs the input or stops the audio
    // process if there's an error.
    recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
    AVAudioInputNode *inputNode = audioEngine.inputNode;
    recognitionRequest.shouldReportPartialResults = YES;
    recognitionTask = [speechRecognizer recognitionTaskWithRequest:recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error) {
        BOOL isFinal = NO;
        if (result) {
            // Whatever you say into the microphone after pressing the button is logged
            // to the console.
            NSLog(@"RESULT:%@",result.bestTranscription.formattedString);
            isFinal = result.isFinal;
        }
        // Tear down the audio pipeline on error or once the final result has arrived
        if (error || isFinal) {
            [audioEngine stop];
            [inputNode removeTapOnBus:0];
            recognitionRequest = nil;
            recognitionTask = nil;
        }
    }];

    // Sets the recording format
    AVAudioFormat *recordingFormat = [inputNode outputFormatForBus:0];
    [inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
        [recognitionRequest appendAudioPCMBuffer:buffer];
    }];

    // Starts the audio engine, i.e. it starts listening.
    [audioEngine prepare];
    if (![audioEngine startAndReturnError:&error]) {
        NSLog(@"Could not start the audio engine: %@", error);
        return;
    }
    NSLog(@"Say Something, I'm listening");
}

- (IBAction)microPhoneTapped:(id)sender {
    if (audioEngine.isRunning) {
        [audioEngine stop];
        [recognitionRequest endAudio];
    } else {
        [self startListening];
    }
}

Now implement the SFSpeechRecognizerDelegate method to be notified when the speech recognizer's availability changes.

#pragma mark - SFSpeechRecognizerDelegate Delegate Methods

- (void)speechRecognizer:(SFSpeechRecognizer *)speechRecognizer availabilityDidChange:(BOOL)available {
    NSLog(@"Availability:%d",available);
}
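
As a variant of the delegate method above, the availability flag can drive the UI instead of only being logged. The following is a minimal sketch, not the original author's code: the class name AvailabilityDemoViewController and the microphoneButton outlet are hypothetical names introduced for illustration.

```objective-c
#import <UIKit/UIKit.h>
#import <Speech/Speech.h>

// Hypothetical view controller, named differently from the ViewController above
@interface AvailabilityDemoViewController : UIViewController <SFSpeechRecognizerDelegate>
// Hypothetical outlet for the button that starts and stops listening
@property (weak, nonatomic) IBOutlet UIButton *microphoneButton;
@end

@implementation AvailabilityDemoViewController

- (void)speechRecognizer:(SFSpeechRecognizer *)speechRecognizer availabilityDidChange:(BOOL)available {
    // UIKit objects should only be touched on the main queue
    dispatch_async(dispatch_get_main_queue(), ^{
        // Disable the button while recognition is unavailable (e.g. no network)
        self.microphoneButton.enabled = available;
        NSLog(@"Availability:%d", available);
    });
}

@end
```

Disabling the button is one possible design; showing a message to the user would work just as well.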

Instructions & Notes

Remember to modify the .plist file to request the user's authorization for speech recognition and for using the microphone. The <string> values should be customized to your needs. You can do this by creating and editing the values in the Property List editor, or by right-clicking the .plist file, choosing Open As -> Source Code, and pasting the following lines before the closing </dict> tag.

<key>NSMicrophoneUsageDescription</key>  <string>This app uses your microphone to record what you say, so watch what you say!</string>

<key>NSSpeechRecognitionUsageDescription</key>  <string>This app uses speech recognition to transform your spoken words into text and then analyze them, so watch what you say!</string>

Also remember that the Speech framework is only available on iOS 10.0 and later, so you need iOS 10.0+ to import it into the project.
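
On the question of which locales can be passed to initWithLocale:, the framework exposes the list through the supportedLocales class method. A minimal sketch that logs them follows; LogSupportedSpeechLocales is a hypothetical helper name, not part of the framework.

```objective-c
#import <Foundation/Foundation.h>
#import <Speech/Speech.h>

// Hypothetical helper: logs the identifier of every locale the
// speech recognizer reports as supported on this device.
static void LogSupportedSpeechLocales(void) {
    NSSet<NSLocale *> *locales = [SFSpeechRecognizer supportedLocales];
    for (NSLocale *locale in locales) {
        NSLog(@"Supported locale: %@", locale.localeIdentifier);
    }
}
```

Calling LogSupportedSpeechLocales() once, for example from viewDidLoad, should be enough to see the list. Note that initWithLocale: returns nil when the given locale is not supported, so it is worth checking the recognizer before using it.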

To get this running and test it, you just need a very basic UI: create a UIButton and assign the microPhoneTapped action to it. When pressed, the app should start listening and log everything it hears through the microphone to the console (in the sample code, NSLog is the only thing receiving the text). Pressing the button again should stop the recording.
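
If you want the text on screen rather than only in the console, one option is a small helper that writes the transcription into a label. This is a sketch under assumptions: TranscriptDemoViewController, transcriptLabel and showTranscription: are hypothetical names, not part of the sample project.

```objective-c
#import <UIKit/UIKit.h>

// Hypothetical view controller used only for this sketch
@interface TranscriptDemoViewController : UIViewController
// Hypothetical outlet for a label that displays the recognized text
@property (weak, nonatomic) IBOutlet UILabel *transcriptLabel;
- (void)showTranscription:(NSString *)text;
@end

@implementation TranscriptDemoViewController

// Displays the latest transcription; dispatching to the main queue is a
// precaution so the label is never updated from a background queue.
- (void)showTranscription:(NSString *)text {
    dispatch_async(dispatch_get_main_queue(), ^{
        self.transcriptLabel.text = text;
    });
}

@end
```

With a helper like this in the view controller, the result handler could call [self showTranscription:result.bestTranscription.formattedString]; next to the existing NSLog call.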

I created a GitHub repo with a sample project, enjoy!

153

answered Oct 21 '22 02:10

Boris

Tags:

ios

objective-c

mobile-application

speech-recognition

speech-to-text
