I am trying to write a method:
- (NSDictionary *)wordFrequencyFromString:(NSString *)string {}
where the returned dictionary maps each word to how often it was used in the string provided. Unfortunately, I can't seem to find a way to iterate through the words in a string to analyze each one, only through each character, which seems like more work than necessary. Any suggestions?
NSString has the -enumerateSubstringsInRange:options:usingBlock: method, which enumerates all words directly and lets the standard API take care of defining word boundaries and so on:
[s enumerateSubstringsInRange:NSMakeRange(0, [s length])
                      options:NSStringEnumerationByWords
                   usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    NSLog(@"%@", substring);
}];
In the enumeration block you can use either an NSMutableDictionary, with words as keys and NSNumber counts as values, or an NSCountedSet, which keeps track of the counts for you.
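To tie this back to the method in the question, here is a minimal sketch that combines the enumeration with an NSCountedSet and then copies the counts into a dictionary. The function name WordFrequencyFromString is hypothetical (a free-function stand-in for the asker's method), and lowercasing each word is an assumption about how case should be treated.

```objc
#import <Foundation/Foundation.h>

// Hypothetical free-function version of the question's
// -wordFrequencyFromString: method; the name is an assumption.
static NSDictionary<NSString *, NSNumber *> *WordFrequencyFromString(NSString *string) {
    NSCountedSet *counts = [[NSCountedSet alloc] init];

    // Let Foundation decide where the word boundaries are.
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByWords
                            usingBlock:^(NSString *word, NSRange wordRange,
                                         NSRange enclosingRange, BOOL *stop) {
        // Assumption: "The" and "the" should count as the same word.
        [counts addObject:word.lowercaseString];
    }];

    // Copy the counts into a dictionary of word -> NSNumber.
    NSMutableDictionary<NSString *, NSNumber *> *result =
        [NSMutableDictionary dictionaryWithCapacity:counts.count];
    for (NSString *word in counts) {
        result[word] = @([counts countForObject:word]);
    }
    return [result copy];
}
```

Calling WordFrequencyFromString(@"the cat and the hat") should give a dictionary in which @"the" maps to @2 and every other word maps to @1. Inside a class, the same body can go straight into -wordFrequencyFromString:.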
You can use componentsSeparatedByCharactersInSet: to split the string, and an NSCountedSet will count the words for you.
1) Split the string into words using a combination of the punctuation, whitespace and newline character sets:
NSMutableCharacterSet *separators = [NSMutableCharacterSet punctuationCharacterSet];
[separators formUnionWithCharacterSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
NSArray *words = [myString componentsSeparatedByCharactersInSet:separators];
2) Count the occurrences of the words. If you want to disregard capitalization, do NSString *myString = [originalString lowercaseString]; before splitting the string into components:
NSCountedSet *frequencies = [NSCountedSet setWithArray:words];
NSUInteger aWordCount = [frequencies countForObject:@"word"];
If you are willing to change your method signature, you can just return the counted set.
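One thing to watch with the splitting approach: componentsSeparatedByCharactersInSet: returns an empty string wherever two separators are adjacent (for example a comma followed by a space), so those empty strings would be counted as a "word" too. A rough sketch that skips them and returns the counted set directly might look like this. The function name WordCountsBySplitting is hypothetical, and lowercasing is again an assumption.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper; the name is an assumption, not part of any API.
static NSCountedSet *WordCountsBySplitting(NSString *string) {
    // Punctuation, whitespace and newlines all act as separators.
    NSMutableCharacterSet *separators = [NSMutableCharacterSet punctuationCharacterSet];
    [separators formUnionWithCharacterSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

    // Assumption: lowercase first so capitalization is disregarded.
    NSArray<NSString *> *pieces =
        [[string lowercaseString] componentsSeparatedByCharactersInSet:separators];

    NSCountedSet *counts = [[NSCountedSet alloc] init];
    for (NSString *piece in pieces) {
        // Adjacent separators yield empty strings; skip them.
        if (piece.length > 0) {
            [counts addObject:piece];
        }
    }
    return counts;
}
```

Because the apostrophe is in the punctuation set, a word like "don't" is split into two pieces here. If that matters, the enumerateSubstringsInRange:options:usingBlock: approach is probably the better fit.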