
Speech recognition on iPhone 5

I am using the iOS speech recognition API from an Objective-C iOS app. It works on the iPhone 6 and 7, but does not work on the iPhone 5 (iOS 10.2.1).

Also note that it works on the iPhone 5s, just not the iPhone 5.

Is the iOS speech API supposed to work on the iPhone 5? Do you have to do anything different to get it to work, or does anyone know what the issue could be?

The basic code is below. No errors occur, and the mic volume is detected, but no speech is detected.

    if (audioEngine != nil) {
        [audioEngine stop];
        [speechTask cancel];
        AVAudioInputNode* inputNode = [audioEngine inputNode];
        [inputNode removeTapOnBus: 0];
    }

    recording = YES;
    micButton.selected = YES;

    //NSLog(@"Starting recording...   SFSpeechRecognizer Available? %d", [speechRecognizer isAvailable]);
    NSError* outError = nil;
    //NSLog(@"AUDIO SESSION CATEGORY0: %@", [[AVAudioSession sharedInstance] category]);
    AVAudioSession* audioSession = [AVAudioSession sharedInstance];
    [audioSession setCategory: AVAudioSessionCategoryPlayAndRecord withOptions:AVAudioSessionCategoryOptionDefaultToSpeaker error:&outError];
    [audioSession setMode: AVAudioSessionModeMeasurement error:&outError];
    [audioSession setActive: true withOptions: AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&outError];

    SFSpeechAudioBufferRecognitionRequest* speechRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
    //NSLog(@"AUDIO SESSION CATEGORY1: %@", [[AVAudioSession sharedInstance] category]);
    if (speechRequest == nil) {
        NSLog(@"Unable to create SFSpeechAudioBufferRecognitionRequest.");
        return;
    }

    speechDetectionSamples = 0;

    // This somehow fixes a crash on iPhone 7.
    // Seems like a bug in iOS ARC/lack of GC: holding the old engine in a
    // local keeps it alive until the end of this scope instead of letting
    // ARC release it the moment audioEngine is reassigned.
    AVAudioEngine* temp = audioEngine;
    audioEngine = [[AVAudioEngine alloc] init];
    AVAudioInputNode* inputNode = [audioEngine inputNode];

    speechRequest.shouldReportPartialResults = true;

    // iOS speech recognition does not detect the end of speech, so we track silence ourselves.
    lastSpeechDetected = -1;

    speechTask = [speechRecognizer recognitionTaskWithRequest: speechRequest delegate: self];

    [inputNode installTapOnBus:0 bufferSize: 4096 format: [inputNode outputFormatForBus:0] block:^(AVAudioPCMBuffer* buffer, AVAudioTime* when) {
        @try {
            long millis = [[NSDate date] timeIntervalSince1970] * 1000;
            if (lastSpeechDetected != -1 && ((millis - lastSpeechDetected) > 1000)) {
                lastSpeechDetected = -1;
                [speechTask finish];
                return;
            }
            [speechRequest appendAudioPCMBuffer: buffer];

            // Estimate volume from the first sample of channel 0.
            if ([buffer floatChannelData] != nil) {
                float volume = fabsf(*buffer.floatChannelData[0]);

                if (volume >= speechDetectionThreshold) {
                    speechDetectionSamples++;

                    if (speechDetectionSamples >= speechDetectionSamplesNeeded) {

                        //Need to change mic button image in main thread
                        [[NSOperationQueue mainQueue] addOperationWithBlock:^ {

                            [micButton setImage: [UIImage imageNamed: @"micRecording"] forState: UIControlStateSelected];

                        }];
                    }
                } else {
                    speechDetectionSamples = 0;
                }
            }
        }
        @catch (NSException * e) {
            NSLog(@"Exception: %@", e);
        }
    }];

    [audioEngine prepare];
    [audioEngine startAndReturnError: &outError];
    NSLog(@"Error %@", outError);
Asked by James, Jul 12 '17


1 Answer

I think the bug is in this line of your code:

long millis = [[NSDate date] timeIntervalSince1970] * 1000;

The iPhone 5 is a 32-bit device, and on 32-bit iOS a long is only 32 bits wide, so the largest value it can hold is 2^31 - 1, i.e. 2,147,483,647. (The iPhone 5s has a 64-bit CPU, where long is 64 bits, which is why your code works there.)
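
To see the magnitudes involved, here is a quick illustrative snippet (mine, not from your post):

    // Seconds since 1970 in July 2017 are about 1.5 billion, so
    // milliseconds are about 1.5 trillion, far more than LONG_MAX
    // (2,147,483,647) on a 32-bit device.
    NSTimeInterval seconds = [[NSDate date] timeIntervalSince1970]; // ~1.5e9
    double ms = seconds * 1000.0;                                   // ~1.5e12
    long overflowed = (long)ms; // out of range on 32-bit: garbage result
    NSLog(@"ms as double: %.0f, as long: %ld", ms, overflowed);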

I checked on the iPhone 5 simulator, and millis does indeed come out negative. The snippet you posted never shows how lastSpeechDetected is set after being initialized to -1, but if it is set elsewhere (presumably from the recognition delegate) and ((millis - lastSpeechDetected) > 1000) happens to be true with these corrupted values, the code enters the if-block and finishes the speech task prematurely.
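
A minimal sketch of a fix, assuming you can widen millis (and lastSpeechDetected) to a fixed-width 64-bit integer:

    // int64_t is 64 bits on both 32-bit and 64-bit devices, so the
    // millisecond timestamp fits on the iPhone 5 as well.
    int64_t millis = (int64_t)([[NSDate date] timeIntervalSince1970] * 1000.0);
    if (lastSpeechDetected != -1 && ((millis - lastSpeechDetected) > 1000)) {
        lastSpeechDetected = -1; // declare lastSpeechDetected as int64_t too
        [speechTask finish];
        return;
    }

Alternatively, store the timestamps as NSTimeInterval (a double) and compare in seconds; that sidesteps the integer width issue entirely.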

Answered by Puneet Sharma, Oct 15 '22