Word Stemming in iOS - Not working for single word

Question

I am using NSLinguisticTagger for word stemming. I can get the stems of the words in a sentence, but I cannot get the stem of a single word.

This is the code I am using:

    NSString *stmnt = @"i waited";
    NSLinguisticTaggerOptions options = NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation | NSLinguisticTaggerJoinNames;

    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:@[NSLinguisticTagSchemeLemma] options:options];
    tagger.string = stmnt;
    [tagger enumerateTagsInRange:NSMakeRange(0, [stmnt length]) scheme:NSLinguisticTagSchemeLemma options:options usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
        NSString *token = [stmnt substringWithRange:tokenRange];
        NSLog(@"%@: %@", token, tag);
    }];

For this input I get the correct output:

    i: i
    waited: wait

But the same code fails to find the stem when stmnt = @"waited";

Any help is greatly appreciated.

Ab'initio · Accepted Answer

The following code worked for me:

    NSString *stmt = @"waited";
    NSRange stringRange = NSMakeRange(0, stmt.length);
    // Tell the tagger up front that the text is English in Latin script.
    NSDictionary *languageMap = @{@"Latn" : @[@"en"]};
    NSOrthography *orthography = [NSOrthography orthographyWithDominantScript:@"Latn" languageMap:languageMap];
    [stmt enumerateLinguisticTagsInRange:stringRange
                                  scheme:NSLinguisticTagSchemeLemma
                                 options:NSLinguisticTaggerOmitWhitespace
                             orthography:orthography
                              usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
        // Log each token with its lemma and range for debugging.
        // NSRange fields are NSUInteger, so cast and print with %lu rather than %d.
        NSString *currentEntity = [stmt substringWithRange:tokenRange];
        NSLog(@"%@ is a %@, tokenRange (location %lu, length %lu)", currentEntity, tag, (unsigned long)tokenRange.location, (unsigned long)tokenRange.length);
    }];

Craig Grummitt · Answer

The accepted answer converted to Swift for those who need it:

    // Typed as NSString so that `length` and the NSRange-based methods below are available.
    let stmt: NSString = "waited"
    let options: NSLinguisticTaggerOptions = .OmitWhitespace
    let stringRange = NSMakeRange(0, stmt.length)
    let languageMap = ["Latn":["en"]]
    let orthography = NSOrthography(dominantScript: "Latn", languageMap: languageMap)

    stmt.enumerateLinguisticTagsInRange(
        stringRange,
        scheme: NSLinguisticTagSchemeLemma,
        options: options,
        orthography: orthography)
        { (tag, tokenRange, sentenceRange, _) -> () in
            let currentEntity = stmt.substringWithRange(tokenRange)
            println(">\(currentEntity):\(tag)")
    }
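
The snippets above use pre-Swift 3 API names. As a rough sketch of the same idea in current Swift (an adaptation, not code from either answer), the tagger can be used directly and given the orthography through setOrthography(_:range:). The helper name `lemmas(in:)` and its return shape are made up for illustration, and enumerateTags(in:unit:scheme:options:using:) requires iOS 11 or macOS 10.13 or later.

```swift
import Foundation

/// Hypothetical helper: returns each word token in `text` together with
/// the lemma the tagger reports for it (nil when no lemma is available).
func lemmas(in text: String) -> [(token: String, lemma: String?)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.lemma], options: 0)
    tagger.string = text

    // NSLinguisticTagger works in UTF-16 offsets, hence NSRange.
    let range = NSRange(location: 0, length: text.utf16.count)

    // Declare the text as English in Latin script instead of letting the
    // tagger guess the language from very little input.
    let orthography = NSOrthography(dominantScript: "Latn",
                                    languageMap: ["Latn": ["en"]])
    tagger.setOrthography(orthography, range: range)

    var results: [(token: String, lemma: String?)] = []
    tagger.enumerateTags(in: range,
                         unit: .word,
                         scheme: .lemma,
                         options: [.omitWhitespace, .omitPunctuation]) { tag, tokenRange, _ in
        let token = (text as NSString).substring(with: tokenRange)
        results.append((token: token, lemma: tag?.rawValue))
    }
    return results
}
```

Calling `lemmas(in: "waited")` yields one entry per word token; printing the result on a device or simulator shows what the tagger actually reports there.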

Vojto · Answer

It doesn't work for a single word because there isn't enough information to determine the word's role in a sentence.

In our case, when the user enters a single word into our natural language parser, we assume it's the name of a thing, and thus a noun.

So we construct a sentence in which the entered word is implied to be a noun, like so:

    let str = "please show me \(word)"

Then just run it through NSLinguisticTagger as usual.
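
A minimal sketch of that workaround in current Swift might look like the following. The helper names `carrierSentence(for:)` and `lemmaInContext(of:)` are hypothetical, the carrier phrase is the one from this answer, and tag(at:unit:scheme:tokenRange:) requires iOS 11 or macOS 10.13 or later. Only the lemma of the token at the start of the entered text is read back, so multi-word input would need an enumeration instead.

```swift
import Foundation

/// Hypothetical helper: embeds `word` in a carrier sentence and returns the
/// sentence plus the UTF-16 range that the word occupies inside it.
func carrierSentence(for word: String) -> (sentence: String, wordRange: NSRange) {
    let prefix = "please show me "
    let sentence = prefix + word
    let wordRange = NSRange(location: prefix.utf16.count,
                            length: word.utf16.count)
    return (sentence, wordRange)
}

/// Hypothetical helper: tags the carrier sentence and returns the lemma
/// reported for the token at the start of `word`, or nil if there is none.
func lemmaInContext(of word: String) -> String? {
    let (sentence, wordRange) = carrierSentence(for: word)
    let tagger = NSLinguisticTagger(tagSchemes: [.lemma], options: 0)
    tagger.string = sentence

    // Ask only about the position where the entered word begins, so the
    // carrier words themselves are never returned.
    let tag = tagger.tag(at: wordRange.location,
                         unit: .word,
                         scheme: .lemma,
                         tokenRange: nil)
    return tag?.rawValue
}
```

Because the carrier phrase implies a noun, this suits input such as `lemmaInContext(of: "cars")`; a word meant as a verb would need a different carrier sentence.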

Tags: ios, objective-c, iphone, linguistics
