Get String Between Two Other Strings in ObjC

I am trying to parse a string and get another string in the middle.

ie.

Hello world this is a string

I need to find the string between "world" and "is" (this). I have looked around but haven't been able to figure it out yet, mainly because I am new to Objective C... Anyone have an idea of how to do this, with RegEx or without?

Andrew M asked Oct 23 '10 03:10


3 Answers

The regular expression solution that Jacques gives works, and the caveat about requiring iOS 4.0 or later is true. Regular expressions are also comparatively slow, and overkill if the search expressions are known string constants.

You can solve the problem using methods on NSString or with the NSScanner class. Both have been available since iPhone OS 2.0, and long before that on the Mac, since before Mac OS X 10.0 actually :).

So what you want is a new method on NSString like this?

@interface NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end;
@end

No problem, and we assume we should return nil if no such strings could be found.

The implementation using NSString only is quite straightforward:

@implementation NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end {
    NSRange startRange = [self rangeOfString:start];
    if (startRange.location != NSNotFound) {
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [self length] - targetRange.location;   
        NSRange endRange = [self rangeOfString:end options:0 range:targetRange];
        if (endRange.location != NSNotFound) {
           targetRange.length = endRange.location - targetRange.location;
           return [self substringWithRange:targetRange];
        }
    }
    return nil;
}
@end

Or you could do the implementation using the NSScanner class:

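@implementation NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end {
    NSScanner* scanner = [NSScanner scannerWithString:self];
    [scanner setCharactersToBeSkipped:nil];
    // NSScanner ignores case by default; match exactly, like rangeOfString: does.
    [scanner setCaseSensitive:YES];
    [scanner scanUpToString:start intoString:NULL];
    if ([scanner scanString:start intoString:NULL]) {
        NSString* result = nil;
        // scanUpToString: runs to the end of the string when end is missing,
        // so confirm that end really follows before returning anything.
        [scanner scanUpToString:end intoString:&result];
        if ([scanner scanString:end intoString:NULL]) {
            // result stays nil when end directly follows start.
            return (result != nil) ? result : @"";
        }
    }
    return nil;
}
@end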
PeyloW answered Nov 11 '22 10:11


Just a simple modification to PeyloW's answer that returns all the strings found between the start and end strings:

- (NSMutableArray*)stringsBetweenString:(NSString*)start andString:(NSString*)end
{
    NSMutableArray* strings = [NSMutableArray array];
    NSRange startRange = [self rangeOfString:start];

    while (startRange.location != NSNotFound)
    {
        // Look for the end string in everything after the start string.
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [self length] - targetRange.location;

        NSRange endRange = [self rangeOfString:end options:0 range:targetRange];
        if (endRange.location == NSNotFound)
        {
            break;
        }

        targetRange.length = endRange.location - targetRange.location;
        [strings addObject:[self substringWithRange:targetRange]];

        // Continue with the next start string after this end string.
        NSRange restOfString;
        restOfString.location = endRange.location + endRange.length;
        restOfString.length = [self length] - restOfString.location;
        startRange = [self rangeOfString:start options:0 range:restOfString];
    }

    return strings;
}
Si. answered Nov 11 '22 10:11


See the ICU user guide on regular expressions.

If you know there'll just be one result:

NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b" options:0 error:NULL];

NSTextCheckingResult *result = [regex firstMatchInString:string
    options:0 range:NSMakeRange(0, [string length])];

// Gets the string inside the first set of parentheses in the regex
// (result is nil when nothing matched, so check it first)
NSString *inside = result ? [string substringWithRange:[result rangeAtIndex:1]] : nil;

The \b makes sure there's a word boundary before world and after is (so "hello world this isn't a string" wouldn't match). The \s+ gobbles up any whitespace after world and before is. The .+? finds what you're looking for, with the ? making it non-greedy so that "hello world this is a string hello world this is a string" doesn't give you "this is a string hello world this". Note that each backslash has to be doubled inside an Objective-C string literal, so \b is written as \\b in the pattern string.

I'll leave it up to you to figure out how to handle multiple matches. The NSRegularExpression documentation should help you out.
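
For what it's worth, here is a minimal sketch of one way to handle multiple matches, using matchesInString:options:range: from NSRegularExpression. The function name CapturedStrings is made up for this example and is not part of any framework; it assumes the pattern has at least one capture group and collects group 1 of every match.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a framework API): returns capture group 1
// of every match of `pattern` in `string`, in order of appearance.
static NSArray *CapturedStrings(NSString *string, NSString *pattern) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        // The pattern did not compile; report it and return nothing.
        NSLog(@"Invalid pattern: %@", error);
        return [NSArray array];
    }

    NSMutableArray *captures = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:string
                                      options:0
                                        range:NSMakeRange(0, [string length])];
    for (NSTextCheckingResult *match in matches) {
        NSRange inner = [match rangeAtIndex:1];
        // A group that took no part in the match reports NSNotFound.
        if (inner.location != NSNotFound) {
            [captures addObject:[string substringWithRange:inner]];
        }
    }
    return captures;
}
```

Calling CapturedStrings(@"hello world this is a string, hello world that is another", @"\\bworld\\s+(.+?)\\s+is\\b") should give you an array containing "this" and "that".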

If you want to make sure the match doesn't cross sentence boundaries, you could do ([^.]+?) instead of (.+?), or you could use enumerateSubstringsInRange:options:usingBlock: on your string and pass NSStringEnumerationBySentences in the options.
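
The sentence-by-sentence route could look roughly like the sketch below. It combines enumerateSubstringsInRange:options:usingBlock: with a plain rangeOfString: search inside each sentence instead of a regex. The function name FirstStringBetweenInSentence is invented for this example, and where the sentences get split is decided by the system, so treat it as a starting point only.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a framework API): returns the first text found
// between `start` and `end` inside a single sentence of `text`, or nil if
// no sentence contains `start` followed by `end`.
static NSString *FirstStringBetweenInSentence(NSString *text,
                                              NSString *start,
                                              NSString *end) {
    __block NSString *found = nil;
    [text enumerateSubstringsInRange:NSMakeRange(0, [text length])
                             options:NSStringEnumerationBySentences
                          usingBlock:^(NSString *sentence,
                                       NSRange sentenceRange,
                                       NSRange enclosingRange,
                                       BOOL *stop) {
        NSRange startRange = [sentence rangeOfString:start];
        if (startRange.location == NSNotFound) {
            return; // this sentence has no start string, try the next one
        }
        // Only search the part of the sentence after the start string.
        NSUInteger from = NSMaxRange(startRange);
        NSRange rest = NSMakeRange(from, [sentence length] - from);
        NSRange endRange = [sentence rangeOfString:end options:0 range:rest];
        if (endRange.location == NSNotFound) {
            return; // start without end in this sentence, try the next one
        }
        found = [sentence substringWithRange:
                    NSMakeRange(from, endRange.location - from)];
        *stop = YES;
    }];
    return found;
}
```

Keep in mind that the search strings are matched literally here: searching for "is" also finds the "is" inside "this", so pass " is" (with the leading space) or similar if you mean the whole word.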

This stuff all needs iOS 4.0+. If you want to support iPhone OS 3.0+, look into RegexKitLite.

Jacques answered Nov 11 '22 11:11