
Separating NSString into NSArray, but allowing quotes to group words

I have a search string, where people can use quotes to group phrases together, and mix this with individual keywords. For example, a string like this:

"Something amazing" rooster

I'd like to separate that into an NSArray, so that it would have Something amazing (without quotes) as one element, and rooster as the other.

Neither componentsSeparatedByString nor componentsSeparatedByCharactersInSet seems to fit the bill. Is there an easy way to do this, or should I just code it up myself?

Tim Sullivan asked Aug 25 '11 15:08



4 Answers

You will probably have to code some of this up yourself, but NSScanner should be a good basis to build on. If you use the scanUpToCharactersFromSet:intoString: method to scan everything up to the next whitespace or quote character, you can pick off individual words. Once you encounter a quote character, you can continue scanning with only the quote in the terminating character set, so that spaces inside the quotes don't end the token.
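The approach described above could be sketched roughly as follows. This is only an illustration under a few assumptions: the function name TokensFromSearchString is hypothetical, ARC is enabled, an unterminated quote simply runs to the end of the string, and empty quoted phrases are dropped.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (the name is an assumption, not part of the answer above).
static NSArray<NSString *> *TokensFromSearchString(NSString *input) {
    NSScanner *scanner = [NSScanner scannerWithString:input];
    // NSScanner skips whitespace by default; turn that off so spaces
    // inside quoted phrases are preserved and handled explicitly.
    scanner.charactersToBeSkipped = nil;

    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSCharacterSet *quote = [NSCharacterSet characterSetWithCharactersInString:@"\""];
    // A bare word ends at whitespace or at a quote.
    NSMutableCharacterSet *wordEnd = [whitespace mutableCopy];
    [wordEnd addCharactersInString:@"\""];

    NSMutableArray<NSString *> *tokens = [NSMutableArray array];

    while (!scanner.isAtEnd) {
        // Discard any run of whitespace between tokens.
        [scanner scanCharactersFromSet:whitespace intoString:NULL];
        if (scanner.isAtEnd) break;

        NSString *token = nil;
        if ([scanner scanString:@"\"" intoString:NULL]) {
            // Inside quotes: only another quote ends the token.
            [scanner scanUpToCharactersFromSet:quote intoString:&token];
            // Consume the closing quote, if there is one.
            [scanner scanString:@"\"" intoString:NULL];
        } else {
            // Bare word: whitespace or a quote ends the token.
            [scanner scanUpToCharactersFromSet:wordEnd intoString:&token];
        }
        if (token.length > 0) [tokens addObject:token];
    }
    return [tokens copy];
}
```

Called with the string from the question ("Something amazing" rooster, including the quotes), this should yield two elements: Something amazing and rooster.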

Tim Dean answered Nov 07 '22 17:11



I made a simple way to do this using NSScanner:

+ (NSArray *)arrayFromTagString:(NSString *)string {
    NSScanner *scanner = [NSScanner scannerWithString:string];
    // NSScanner skips whitespace by default; disable that so spaces are handled explicitly
    scanner.charactersToBeSkipped = nil;
    NSMutableArray *array = [[NSMutableArray alloc] init];

    while (scanner.scanLocation < string.length) {
        // reset on every pass so a failed scan never re-adds a stale value
        NSString *substring = nil;

        // test if the first character is a quote
        unichar character = [string characterAtIndex:scanner.scanLocation];
        if (character == '"') {
            // skip the first quote and scan everything up to the next quote into a substring
            [scanner setScanLocation:(scanner.scanLocation + 1)];
            [scanner scanUpToString:@"\"" intoString:&substring];
            // skip the second quote too, unless the quote was never closed
            if (scanner.scanLocation < string.length) [scanner setScanLocation:(scanner.scanLocation + 1)];
        }
        else {
            // scan everything up to the next space into the substring
            [scanner scanUpToString:@" " intoString:&substring];
        }
        // add the substring to the array; nil means nothing was scanned (e.g. "" or a repeated space)
        if (substring) [array addObject:substring];

        // if not at the end, skip the space character before continuing the loop
        if (scanner.scanLocation < string.length) [scanner setScanLocation:(scanner.scanLocation + 1)];
    }
    return [array copy];
}

This method will convert the array back to a tag string, re-quoting the multi-word tags:

+ (NSString *)tagStringFromArray:(NSArray *)array {
    NSMutableString *string = [[NSMutableString alloc] init];

    for (NSString *substring in array) {
        // separate tags with a single space
        if (string.length > 0) {
            [string appendString:@" "];
        }
        // re-quote any tag that contains a space
        NSRange range = [substring rangeOfString:@" "];
        if (range.location != NSNotFound) {
            [string appendFormat:@"\"%@\"", substring];
        }
        else {
            [string appendString:substring];
        }
    }
    // return an immutable copy rather than the mutable string itself
    return [string copy];
}

Elmer Cat answered Nov 07 '22 18:11



I ended up going with a regular expression, as I was already using RegexKitLite, and created this NSString+SearchExtensions category.

.h:

//  NSString+SearchExtensions.h
#import <Foundation/Foundation.h>
@interface NSString (SearchExtensions)
-(NSArray *)searchParts;
@end

.m:

//  NSString+SearchExtensions.m
#import "NSString+SearchExtensions.h"
#import "RegexKitLite.h"

@implementation NSString (SearchExtensions)

-(NSArray *)searchParts {
    __block NSMutableArray *items = [[NSMutableArray alloc] initWithCapacity:5];

    [self enumerateStringsMatchedByRegex:@"\\w+|\"[\\w\\s]*\"" usingBlock: ^(NSInteger captureCount,
       NSString * const capturedStrings[captureCount],
       const NSRange capturedRanges[captureCount],
       volatile BOOL * const stop) {

        NSString *result = [capturedStrings[0] stringByReplacingOccurrencesOfRegex:@"\"" withString:@""];

        NSLog(@"Match: '%@'", result);
        [items addObject:result];
    }];        
    return [items autorelease];
}
@end

This returns an NSArray of the search terms, with the double quotes that surround phrases removed.
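For projects that don't include RegexKitLite, a similar result can probably be had with Foundation's NSRegularExpression. The following is a minimal sketch, not the code from this answer: the function name SearchPartsUsingNSRegularExpression is hypothetical, ARC is assumed, and the pattern is an assumption that differs from the one above (it uses capture groups to drop the quotes and treats any run of non-whitespace, non-quote characters as a word, rather than \w+).

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper; the name and the pattern are assumptions for illustration.
static NSArray<NSString *> *SearchPartsUsingNSRegularExpression(NSString *input) {
    // Group 1 captures the inside of a quoted phrase; group 2 captures a bare word.
    NSString *pattern = @"\"([^\"]*)\"|([^\\s\"]+)";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *parts = [NSMutableArray array];
    NSRange whole = NSMakeRange(0, input.length);
    for (NSTextCheckingResult *match in [regex matchesInString:input options:0 range:whole]) {
        // A group that did not take part in the match has location NSNotFound.
        NSRange quoted = [match rangeAtIndex:1];
        NSRange range = (quoted.location != NSNotFound) ? quoted : [match rangeAtIndex:2];
        // Skip empty phrases such as "".
        if (range.length > 0) [parts addObject:[input substringWithRange:range]];
    }
    return [parts copy];
}
```

Because the quotes sit outside the capture groups, no second replacement pass is needed to strip them. With the string from the question, this should return Something amazing and rooster.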

Tim Sullivan answered Nov 07 '22 17:11



If you'll allow a slightly different approach, you could try Dave DeLong's CHCSVParser. It is intended to parse CSV strings, but if you set the space character as the delimiter, I am pretty sure you will get the intended behavior.

Alternatively, you can peek into the code and see how it handles quoted fields - it is published under the MIT license.

Monolo answered Nov 07 '22 17:11
