I am trying to parse a string and get another string in the middle.
For example:
Hello world this is a string
I need to find the string between "world" and "is" (that is, "this"). I have looked around but haven't been able to figure it out yet, mainly because I am new to Objective-C. Does anyone have an idea of how to do this, with or without regular expressions?
The regular expression solution that Jacques gives works, and his caveat that it requires iOS 4.0 or later is correct. Regular expressions are also comparatively slow, and overkill if the search strings are known constants.
You can solve the problem using methods on NSString, or a class named NSScanner; both have been available since iPhone OS 2.0 and long before that, since before Mac OS X 10.0 actually :).
So what you want is a new method on NSString like this?
@interface NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end;
@end
No problem, and we assume we should return nil if no such strings could be found.
The implementation using only NSString is quite straightforward:
@implementation NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end {
    NSRange startRange = [self rangeOfString:start];
    if (startRange.location != NSNotFound) {
        // Search for the end string only in the part after the start string.
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [self length] - targetRange.location;
        NSRange endRange = [self rangeOfString:end options:0 range:targetRange];
        if (endRange.location != NSNotFound) {
            targetRange.length = endRange.location - targetRange.location;
            return [self substringWithRange:targetRange];
        }
    }
    return nil;
}
@end
Or you could do the implementation using the NSScanner class:
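@implementation NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end {
    NSScanner* scanner = [NSScanner scannerWithString:self];
    [scanner setCharactersToBeSkipped:nil];
    // Scanners ignore case by default; match the rangeOfString: version.
    [scanner setCaseSensitive:YES];
    [scanner scanUpToString:start intoString:NULL];
    if ([scanner scanString:start intoString:NULL]) {
        NSString* result = nil;
        // scanUpToString: returns NO when nothing lies between the two strings.
        if (![scanner scanUpToString:end intoString:&result]) {
            result = @"";
        }
        // Without this check a missing end string would return the rest of self.
        if ([scanner scanString:end intoString:NULL]) {
            return result;
        }
    }
    return nil;
}
@end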
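One thing worth checking with either implementation is that the delimiters are matched as plain substrings, not as whole words. The sketch below illustrates this on the string from the question. CWSubstringBetween and CWDemo are hypothetical names, and the free function only mirrors the rangeOfString: logic so the example is self-contained; it is not part of the answer's category.

```objc
#import <Foundation/Foundation.h>

// Hypothetical free-function version of the rangeOfString: logic above,
// written only so that this sketch stands on its own.
static NSString *CWSubstringBetween(NSString *string, NSString *start, NSString *end) {
    NSRange startRange = [string rangeOfString:start];
    if (startRange.location == NSNotFound) return nil;
    NSUInteger from = NSMaxRange(startRange);
    NSRange rest = NSMakeRange(from, [string length] - from);
    NSRange endRange = [string rangeOfString:end options:0 range:rest];
    if (endRange.location == NSNotFound) return nil;
    return [string substringWithRange:NSMakeRange(from, endRange.location - from)];
}

static void CWDemo(void) {
    NSString *s = @"Hello world this is a string";
    // "is" is first found inside "this", so the result is cut short.
    NSLog(@"'%@'", CWSubstringBetween(s, @"world", @"is"));   // ' th'
    // Including the surrounding spaces in the delimiters gives the word.
    NSLog(@"'%@'", CWSubstringBetween(s, @"world ", @" is")); // 'this'
}
```

So for the question's example you would probably want to pass @"world " and @" is" rather than the bare words.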
Just a simple modification to PeyloW's answer, that returns all strings within the start and end strings:
- (NSMutableArray*)stringsBetweenString:(NSString*)start andString:(NSString*)end
{
    NSMutableArray* strings = [NSMutableArray arrayWithCapacity:0];
    NSRange startRange = [self rangeOfString:start];
    while (startRange.location != NSNotFound)
    {
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [self length] - targetRange.location;
        NSRange endRange = [self rangeOfString:end options:0 range:targetRange];
        if (endRange.location == NSNotFound)
        {
            break;
        }
        targetRange.length = endRange.location - targetRange.location;
        [strings addObject:[self substringWithRange:targetRange]];
        // Look for the next start string after the end string just found.
        NSRange restOfString;
        restOfString.location = endRange.location + endRange.length;
        restOfString.length = [self length] - restOfString.location;
        startRange = [self rangeOfString:start options:0 range:restOfString];
    }
    return strings;
}
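The same "collect every match" idea can also be sketched with NSScanner, which keeps track of the current position for you. This is only an illustration under assumptions: CWStringsBetween is a hypothetical free function, not something from the answers above, and it is written to stop at the first start string that has no matching end string, as the loop above does.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collects every substring found between `start` and
// `end`, scanning left to right with NSScanner.
static NSArray *CWStringsBetween(NSString *string, NSString *start, NSString *end) {
    NSMutableArray *strings = [NSMutableArray array];
    if ([start length] == 0 || [end length] == 0) {
        return strings; // nothing sensible to scan for
    }
    NSScanner *scanner = [NSScanner scannerWithString:string];
    [scanner setCharactersToBeSkipped:nil]; // keep whitespace in the results
    [scanner setCaseSensitive:YES];         // scanners ignore case by default
    while (![scanner isAtEnd]) {
        [scanner scanUpToString:start intoString:NULL];
        if (![scanner scanString:start intoString:NULL]) {
            break; // no further start string
        }
        NSString *piece = nil;
        if (![scanner scanUpToString:end intoString:&piece]) {
            piece = @""; // start and end were directly adjacent
        }
        if (![scanner scanString:end intoString:NULL]) {
            break; // start string without a matching end string
        }
        [strings addObject:piece];
    }
    return strings;
}
```

For example, CWStringsBetween(@"[a][bc] x [d", @"[", @"]") should give an array containing "a" and "bc"; the trailing "[d" is dropped because it is never closed.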
See the ICU user guide on regular expressions.
If you know there'll just be one result:
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b" options:0 error:NULL];
NSTextCheckingResult *result = [regex firstMatchInString:string
    options:0 range:NSMakeRange(0, [string length])];
// Gets the string inside the first set of parentheses in the regex
// (nil if the pattern did not match at all)
NSString *inside = result ? [string substringWithRange:[result rangeAtIndex:1]] : nil;
The \b makes sure there's a word boundary before world and after is (so "hello world this island" wouldn't match; note that with the default options an apostrophe also counts as a boundary, so "this isn't" would still match). The \s+ gobbles up any whitespace after world and before is. The .+? finds what you're looking for, with the ? making it non-greedy so that "hello world this is a string hello world this is a string" doesn't give you "this is a string hello world this".
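To see the effect of the ? for yourself, a small sketch like the following can help. CWFirstCapture is a hypothetical helper introduced only for this illustration; it returns the first capture group of the first match, or nil. Note that inside an Objective-C string literal each backslash in the pattern has to be written twice.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns capture group 1 of the first match, or nil
// if the pattern does not match.
static NSString *CWFirstCapture(NSString *pattern, NSString *string) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
    if (match == nil) return nil;
    NSRange group = [match rangeAtIndex:1];
    if (group.location == NSNotFound) return nil;
    return [string substringWithRange:group];
}

static void CWGreedyDemo(void) {
    NSString *s = @"hello world this is a string hello world this is a string";
    // Greedy: runs on to the last " is" in the string.
    NSLog(@"%@", CWFirstCapture(@"\\bworld\\s+(.+)\\s+is\\b", s));
    // Non-greedy: stops at the first " is".
    NSLog(@"%@", CWFirstCapture(@"\\bworld\\s+(.+?)\\s+is\\b", s));
}
```

The first call should log "this is a string hello world this" and the second just "this".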
I'll leave it up to you to figure out how to handle multiple matches. The NSRegularExpression documentation should help you out.
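As a starting point for multiple matches, one possible approach is matchesInString:options:range:, which returns every non-overlapping match as an NSTextCheckingResult. The sketch below assumes the non-greedy pattern discussed above; CWAllCaptures is a hypothetical name used only here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text of capture group 1 for every
// match of the pattern in `string`, in order of appearance.
static NSArray *CWAllCaptures(NSString *string) {
    NSRegularExpression *regex = [NSRegularExpression
        regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b" options:0 error:NULL];
    NSMutableArray *captures = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:string
                                      options:0
                                        range:NSMakeRange(0, [string length])];
    for (NSTextCheckingResult *match in matches) {
        // Group 1 always takes part in a match of this pattern,
        // so its range is valid here.
        [captures addObject:[string substringWithRange:[match rangeAtIndex:1]]];
    }
    return captures;
}
```

Called on @"world one is x world two is y", this should return an array containing "one" and "two".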
If you want to make sure the match doesn't cross sentence boundaries, you could do ([^.]+?) instead of (.+?), or you could use enumerateSubstringsInRange:options:usingBlock: on your string and pass NSStringEnumerationBySentences in the options.
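The sentence-by-sentence variant could look roughly like this. It is a sketch, not a definitive implementation: CWCapturesPerSentence is a hypothetical name, it only takes the first match in each sentence, and where sentences are split is decided by the system, so unusual punctuation may be segmented differently than you expect.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: applies the pattern to each sentence separately,
// so a match can never span a sentence boundary.
static NSArray *CWCapturesPerSentence(NSString *text) {
    NSRegularExpression *regex = [NSRegularExpression
        regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b" options:0 error:NULL];
    NSMutableArray *captures = [NSMutableArray array];
    [text enumerateSubstringsInRange:NSMakeRange(0, [text length])
                             options:NSStringEnumerationBySentences
                          usingBlock:^(NSString *sentence, NSRange sentenceRange,
                                       NSRange enclosingRange, BOOL *stop) {
        NSTextCheckingResult *match =
            [regex firstMatchInString:sentence
                              options:0
                                range:NSMakeRange(0, [sentence length])];
        if (match != nil) {
            [captures addObject:[sentence substringWithRange:[match rangeAtIndex:1]]];
        }
    }];
    return captures;
}
```

With @"I saw the world today. It is big. The world really is round." this should yield only "really", whereas running the pattern over the whole string would also pick up "today. It" across the first sentence boundary.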
This stuff all needs iOS 4.0+. If you want to support 3.0+, look into RegexKitLite.