I have been using NSDataDetector to parse addresses out of strings, and for the most part it does a good job. However, it fails to detect addresses like this one:
6200 North Evan Blvd Suit 487 Highland UT 84043
Currently I am using this code:
NSError *error = nil;
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeAddress error:&error];
NSArray *matches = [detector matchesInString:output options:0 range:NSMakeRange(0, [output length])];
for (NSTextCheckingResult *match in matches) {
    if ([match resultType] == NSTextCheckingTypeAddress) {
        _address = [_tesseractData substringWithRange:[match range]];
        NSDictionary *data = [match addressComponents];
        _zip = [data objectForKey:@"ZIP"];
        if (_zip) {
            NSRange zipRange = [_tesseractData rangeOfString:_zip];
            if (zipRange.location != NSNotFound) {
                [_tesseractData deleteCharactersInRange:zipRange];
            }
        }
        _city = [data objectForKey:@"City"];
        if (_city) {
            NSRange cityRange = [_tesseractData rangeOfString:[_city uppercaseString]];
            if (cityRange.location != NSNotFound) {
                [_tesseractData deleteCharactersInRange:cityRange];
            }
        }
        _city = [_city capitalizedString];
        _state = [data objectForKey:@"State"];
        _street = [data objectForKey:@"Street"];
        if (_street) {
            NSRange streetRange = [_tesseractData rangeOfString:[_street uppercaseString]];
            if (streetRange.location != NSNotFound) {
                [_tesseractData deleteCharactersInRange:streetRange];
            }
        }
        _street = [_street capitalizedString];
    }
}
Can anyone suggest a more robust method for parsing a physical address out of a string? I need to be able to get the ZIP, street, state and city.
An NSDataDetector is an NSRegularExpression subclass, so maybe you could create a customized instance and start by checking what Apple uses for the pattern and options parameters.
Something along these lines:
NSError *error = nil;
NSDataDetector *dataDetectorRegEx = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeAddress error:&error];
NSString *dataDetectorPattern = dataDetectorRegEx.pattern;
NSLog(@"Check out this pattern!: %@", dataDetectorPattern);
// Customize the pattern for your special cases
NSString *customPattern = [NSString stringWithFormat:@"<MY_OTHER_PATTERNS + %@>", dataDetectorPattern];
// someOptions is a placeholder for the NSRegularExpressionOptions you choose (for example dataDetectorRegEx.options)
NSRegularExpression *customDataDetectorLikeRegEx = [NSRegularExpression regularExpressionWithPattern:customPattern options:someOptions error:&error];
You can try parsing the address information with regular expressions (RegEx); I think that is a more robust approach. See the following references for working with RegEx: Making RegEx Easy in Objective-C, and Objective-C RegEx Categories, which is available on GitHub.
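To make the regular-expression idea concrete, here is a rough sketch using plain NSRegularExpression. It assumes US-style addresses written on one line in the order street, optional unit, city, two-letter state, ZIP, and it relies on a short, hand-picked list of street suffixes to find where the street ends. The function name ParseUSAddress, the suffix list and the unit keywords (including the "Suit" spelling from the question) are my own assumptions for illustration, not anything NSDataDetector does internally, so expect to extend them for real data.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns a dictionary keyed like -addressComponents
// (Street, City, State, ZIP), or nil when the input does not fit the pattern.
static NSDictionary<NSString *, NSString *> *ParseUSAddress(NSString *input) {
    if (input.length == 0) {
        return nil;
    }

    // Group 1: house number, street name, a known suffix, optional unit.
    // Group 2: city. Group 3: two-letter state. Group 4: ZIP or ZIP+4.
    // The suffix and unit lists are assumptions; extend them as needed.
    NSString *pattern =
        @"^\\s*(\\d+\\s+.+?\\b(?:Blvd|Boulevard|St|Street|Ave|Avenue|Rd|Road|Dr|Drive|Ln|Lane|Way|Ct|Court)\\b\\.?"
        @"(?:\\s+(?:Suite|Suit|Ste|Apt|Unit|#)\\s*\\w+)?)"
        @"[\\s,]+(.+?)[\\s,]+([A-Z]{2})[\\s,]+(\\d{5}(?:-\\d{4})?)\\s*$";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (!regex) {
        NSLog(@"Invalid address pattern: %@", error);
        return nil;
    }

    NSTextCheckingResult *match =
        [regex firstMatchInString:input options:0 range:NSMakeRange(0, input.length)];
    if (!match) {
        return nil;
    }

    // All four groups are mandatory, so their ranges are valid on a match.
    return @{
        NSTextCheckingStreetKey : [input substringWithRange:[match rangeAtIndex:1]],
        NSTextCheckingCityKey   : [input substringWithRange:[match rangeAtIndex:2]],
        NSTextCheckingStateKey  : [input substringWithRange:[match rangeAtIndex:3]],
        NSTextCheckingZIPKey    : [input substringWithRange:[match rangeAtIndex:4]],
    };
}
```

Because the result uses the same keys as addressComponents, one option is to try NSDataDetector first and only call a helper like this when the detector returns no address match. The pattern is anchored to the whole string, so it works best on a single line that has already been isolated from the surrounding text.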