I am using NSLinguisticTagger for word stemming. I can get the stems of the words in a sentence, but not the stem of a single word on its own.
This is the code I am using:
NSString *stmnt = @"i waited";
NSLinguisticTaggerOptions options = NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation | NSLinguisticTaggerJoinNames;
NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:@[NSLinguisticTagSchemeLemma] options:options];
tagger.string = stmnt;
[tagger enumerateTagsInRange:NSMakeRange(0, [stmnt length]) scheme:NSLinguisticTagSchemeLemma options:options usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
NSString *token = [stmnt substringWithRange:tokenRange];
NSLog(@"%@: %@", token, tag);
}];
For this I get the correct output:
i: i
waited: wait
But the same code fails to identify the stem when stmnt = @"waited";
Any help is greatly appreciated.
The following code worked for me:
NSString *stmt = @"waited";
NSRange stringRange = NSMakeRange(0, stmt.length);
NSDictionary* languageMap = @{@"Latn" : @[@"en"]};
[stmt enumerateLinguisticTagsInRange:stringRange
scheme:NSLinguisticTagSchemeLemma
options:NSLinguisticTaggerOmitWhitespace
orthography:[NSOrthography orthographyWithDominantScript:@"Latn" languageMap:languageMap]
usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
// Log info to console for debugging purposes
NSString *currentEntity = [stmt substringWithRange:tokenRange];
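// NSRange fields are NSUInteger, so cast and print with %lu; location comes first, then length
NSLog(@"%@ has lemma %@, tokenRange (%lu,%lu)", currentEntity, tag, (unsigned long)tokenRange.location, (unsigned long)tokenRange.length);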
}];
The accepted answer converted to Swift for those who need it:
// Declared as NSString so that length and the NSRange-based calls below are available
let stmt: NSString = "waited"
let options: NSLinguisticTaggerOptions = .OmitWhitespace
let stringRange = NSMakeRange(0, stmt.length)
let languageMap = ["Latn":["en"]]
let orthography = NSOrthography(dominantScript: "Latn", languageMap: languageMap)
stmt.enumerateLinguisticTagsInRange(
stringRange,
scheme: NSLinguisticTagSchemeLemma,
options: options,
orthography: orthography)
{ (tag, tokenRange, sentenceRange, _) -> () in
let currentEntity = stmt.substringWithRange(tokenRange)
println(">\(currentEntity):\(tag)")
}
It doesn't work for a single word because there isn't enough information to determine the word's role in a sentence.
In our case, when the user enters a single word into our natural language parser, we assume it is the name of a thing, and thus a noun.
So we construct a sentence in which the entered word is implied to be a noun, like so:
let str = "please show me \(word)"
Then just run it through NSLinguisticTagger as usual.
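For anyone who wants to try this, here is a minimal sketch of the carrier-sentence idea in current Swift. The helper names carrierSentence(for:) and lemma(ofSingleWord:) are made up for illustration and are not part of any API, and the "please show me" prefix is simply the one from the snippet above. The tagger part is guarded because NSLinguisticTagger is only assumed to be available on Apple platforms.

```swift
import Foundation

// Hypothetical helper: embeds a single word in a carrier sentence and
// returns the sentence together with the NSRange of the word inside it.
func carrierSentence(for word: String) -> (sentence: String, wordRange: NSRange) {
    let prefix = "please show me "
    let sentence = prefix + word
    // NSRange is measured in UTF-16 units, so take lengths via NSString.
    let wordRange = NSRange(location: (prefix as NSString).length,
                            length: (word as NSString).length)
    return (sentence, wordRange)
}

#if canImport(Darwin)
// Hypothetical helper: asks the tagger for the lemma of the token that
// starts at the word's position in the carrier sentence.
// Returns nil if the tagger does not produce a lemma.
func lemma(ofSingleWord word: String) -> String? {
    let (sentence, wordRange) = carrierSentence(for: word)
    let tagger = NSLinguisticTagger(tagSchemes: [.lemma], options: 0)
    tagger.string = sentence
    let tag = tagger.tag(at: wordRange.location,
                         scheme: .lemma,
                         tokenRange: nil,
                         sentenceRange: nil)
    return tag?.rawValue
}
#endif
```

Calling lemma(ofSingleWord: "waited") then looks up only the token at the word's position, so the carrier words never show up in the result. Whether a lemma comes back still depends on the tagger and on the carrier sentence you pick, so treat a nil result as "unknown" and fall back to the original word.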