 

Regex to extract all the substrings between two characters or tags

I need to extract all the strings surrounded by two characters (or maybe two tags).

This is what I've done so far:

    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\[(.*?)\\]" options:NSRegularExpressionCaseInsensitive error:NULL];

    NSArray *myArray = [regex matchesInString:@"[db1]+[db2]+[db3]" options:0 range:NSMakeRange(0, [@"[db1]+[db2]+[db3]" length])] ;

    NSLog(@"%@",[myArray objectAtIndex:0]);
    NSLog(@"%@",[myArray objectAtIndex:1]);
    NSLog(@"%@",[myArray objectAtIndex:2]);

myArray correctly contains three objects, but NSLog prints this:

<NSSimpleRegularExpressionCheckingResult: 0x926ec30>{0, 5}{<NSRegularExpression: 0x926e660> \[(.*?)\] 0x1}
<NSSimpleRegularExpressionCheckingResult: 0x926eb30>{6, 5}{<NSRegularExpression: 0x926e660> \[(.*?)\] 0x1}
<NSSimpleRegularExpressionCheckingResult: 0x926eb50>{12, 5}{<NSRegularExpression: 0x926e660> \[(.*?)\] 0x1}

instead of db1, db2 and db3.

Where am I going wrong?

Gianluca asked Dec 04 '12 16:12



1 Answer

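According to the documentation, matchesInString:options:range: returns an array of NSTextCheckingResult objects, not NSStrings. That is why NSLog prints each result's description (its range and the regex) rather than the matched text. You need to loop over the results and use their ranges to extract the substrings from the input string. Use rangeAtIndex:1 to get the range of the first capture group, which excludes the surrounding brackets.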

    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\[(.*?)\\]" options:NSRegularExpressionCaseInsensitive error:NULL];

    NSString *input = @"[db1]+[db2]+[db3]";
    NSArray *myArray = [regex matchesInString:input options:0 range:NSMakeRange(0, [input length])];

    NSMutableArray *matches = [NSMutableArray arrayWithCapacity:[myArray count]];

    for (NSTextCheckingResult *match in myArray) {
        // Index 0 is the whole match ("[db1]"); index 1 is the first capture group ("db1").
        NSRange matchRange = [match rangeAtIndex:1];
        [matches addObject:[input substringWithRange:matchRange]];
        NSLog(@"%@", [matches lastObject]);
    }
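
The loop above could also be wrapped in a reusable function. The following is only a sketch under a few assumptions: CapturedSubstrings is a hypothetical helper name (not part of Foundation), the pattern is assumed to contain exactly one capture group of interest, and it uses enumerateMatchesInString:options:range:usingBlock: instead of building the intermediate array of results. It also checks for NSNotFound, which is what rangeAtIndex: reports for a group that did not participate in the match.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns the text captured by
// group 1 of `pattern` for every match found in `input`.
static NSArray<NSString *> *CapturedSubstrings(NSString *input, NSString *pattern) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *captures = [NSMutableArray array];
    [regex enumerateMatchesInString:input
                            options:0
                              range:NSMakeRange(0, input.length)
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        if (match == nil) {
            return; // only possible with progress-reporting options; guarded anyway
        }
        // match.range covers the whole match, brackets included ("[db1]");
        // rangeAtIndex:1 covers only the first capture group ("db1").
        NSRange inner = [match rangeAtIndex:1];
        if (inner.location != NSNotFound) {
            [captures addObject:[input substringWithRange:inner]];
        }
    }];
    return captures;
}

int main(void) {
    @autoreleasepool {
        NSArray<NSString *> *names = CapturedSubstrings(@"[db1]+[db2]+[db3]", @"\\[(.*?)\\]");
        // Logs the three captured names: db1, db2, db3
        NSLog(@"%@", [names componentsJoinedByString:@", "]);
    }
    return 0;
}
```

For the "two tags" case from the question, the same helper should work with a pattern whose delimiters are the tags, for example @"<b>(.*?)</b>" (an illustrative pattern, not one from the original post). Note that a regex of this kind does not handle nested tags.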
Joe answered Sep 28 '22 02:09