
NSRegularExpression:enumerateMatchesInString hangs when called more than once

In an iPhone app I am developing, I use NSRegularExpression to parse some HTML and extract data to plot on a map. This information is updated whenever the user "pans" the map to a new location.

This works fine the first time or two through, but the second or third time the function is called, the application hangs. I have used Xcode's profiler to confirm I am not leaking memory, and no error is generated (the application does not terminate, it just sits in execution at the point shown below).

When I examine the HTML being parsed, I do not see that it is incomplete or otherwise garbled when the application hangs.

Furthermore, if I replace the regex code with a collection of explicit address strings, everything works as expected.

- (void)connectionDidFinishLoading:(NSURLConnection *)connection {
   // receivedData contains the returned HTML
   NSString *result = [[NSString alloc] initWithData:receivedData encoding:NSASCIIStringEncoding];
   NSError *error = nil;
   NSString *pattern = @"description.*?h4>(.*?)<\\/h4>.*?\"address>[ \\s]*(.*?)<.*?zip>.*?(\\d{5,5}), US<";
   NSRegularExpression *regex = [NSRegularExpression
                              regularExpressionWithPattern:pattern
                              options:NSRegularExpressionDotMatchesLineSeparators
                              error:&error];
   __block NSUInteger counter = 0;
   // the application hangs on the next line after 1-2 times through
   [regex enumerateMatchesInString:result options:0 range:NSMakeRange(0, [result length]) usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop){
       NSRange range = [match rangeAtIndex:2];
       NSString *streetAddress =[result substringWithRange:range];
       range = [match rangeAtIndex:3];
       NSString *cityStateZip = [result substringWithRange:range];
       NSString *address = [NSString stringWithFormat:@"%@ %@", streetAddress, cityStateZip];
       EKItemInfo *party = [self addItem:address]; // geocode address and then map it
       if (++counter > 4) *stop = YES;
   }];
   [receivedData release];
   [result release];
   [connection release]; //alloc'd previously, so release here.
}

I realize this is going to be difficult/impossible to duplicate, but I was wondering if anyone has run into a similar issue with NSRegularExpression or if there is something obviously wrong here.

asked Jun 22 '11 04:06 by Eric


1 Answer

I have also run into a similar regular expression problem. In my case, the cause was character encoding, so I wrote some code that tries several character encodings in turn. Maybe this code will help you.

+ (NSString *)encodedStringWithContentsOfURL:(NSURL *)url
{
    // Get the web page HTML
    NSData *data = [NSData dataWithContentsOfURL:url];

    // Candidate encodings, tried in order until one decodes the data
    NSStringEncoding enc_arr[] = {
        NSUTF8StringEncoding,           // UTF-8
        NSShiftJISStringEncoding,       // Shift_JIS
        NSJapaneseEUCStringEncoding,    // EUC-JP
        NSISO2022JPStringEncoding,      // JIS
        NSUnicodeStringEncoding,        // Unicode
        NSASCIIStringEncoding           // ASCII
    };
    NSString *data_str = nil;
    int max = sizeof(enc_arr) / sizeof(enc_arr[0]);
    for (int i = 0; i < max; i++) {
        data_str = [[NSString alloc] initWithData:data encoding:enc_arr[i]];
        if (data_str != nil) {
            break;  // stop at the first encoding that succeeds
        }
    }
    return data_str;
}

You can download the whole category library from GitHub and just run it. I hope this helps.

https://github.com/weed/p120801_CharacterEncodingLibrary
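The question already has the downloaded bytes in receivedData, so the same fallback idea could be applied to an NSData directly rather than fetching the URL again. Here is a minimal sketch, assuming ARC. DecodedStringFromData is a hypothetical helper name used only for illustration; it is not part of the linked library.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decode already-downloaded data by trying each
// candidate encoding in order and returning the first successful result.
static NSString *DecodedStringFromData(NSData *data) {
    if (data == nil) {
        return nil;
    }
    const NSStringEncoding encodings[] = {
        NSUTF8StringEncoding,
        NSShiftJISStringEncoding,
        NSJapaneseEUCStringEncoding,
        NSISO2022JPStringEncoding,
        NSUnicodeStringEncoding,
        NSASCIIStringEncoding
    };
    const size_t count = sizeof(encodings) / sizeof(encodings[0]);
    for (size_t i = 0; i < count; i++) {
        NSString *decoded = [[NSString alloc] initWithData:data
                                                  encoding:encodings[i]];
        if (decoded != nil) {
            // Assumes ARC. Under manual reference counting this would
            // need to be [decoded autorelease].
            return decoded;
        }
    }
    return nil; // none of the candidate encodings could decode the data
}
```

In connectionDidFinishLoading: this could stand in for the single NSASCIIStringEncoding attempt, for example `NSString *result = DecodedStringFromData(receivedData);`, followed by a nil check before running the regex. Because the question's code uses manual release calls, the ownership of result would have to be adjusted to match.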

answered Nov 10 '22 08:11 by Feel Physics