
Text string with EMOJI causing issues with NSRange

I am using TTTAttributedLabel to apply formatting to text, however it seems to crash because I am trying to apply formatting to a range which includes emoji. Example:

NSString *text = @"@user1234 🍺🍺 #hashtag"; // text.length reported as 22 by NSLog as each emoji is 2 chars in length
cell.textLabel.text = text;

int length = 8;
int start = 13;

NSRange range = NSMakeRange(start, length); // NSRange is a struct, not an object pointer

if (!NSEqualRanges(range, NSMakeRange(NSNotFound, 0))) {
    // apply formatting to TTTAttributedLabel
    [cell.textLabel addLinkToURL:[NSURL URLWithString:[NSString stringWithFormat:@"someaction://hashtag/%@", [cell.textLabel.text substringWithRange:range]]] withRange:range];
}

Note: I am passed the NSRange values from an API, as well as the text string.

In the above I am attempting to apply formatting to #hashtag. Normally this works fine, but with emoji in the string I believe the range ends up covering part of an emoji, since each emoji is stored as two UTF-16 values. That makes TTTAttributedLabel fail (it actually hangs rather than crashing).

Strangely, it works fine if there is 1 emoji, but breaks if there are 2.

Can anyone help me figure out what to do here?

mootymoots asked Mar 09 '13 15:03


2 Answers

I assume this is from the Twitter API, and you are trying to use the entities dictionary they return. I have just been writing code to support handling those ranges along with NSString's version of the range of a string.

My approach was to "fix" the entities dictionary that Twitter returns so that it copes with the extra characters. I can't share code, for various reasons, but this is what I did:

  1. Make a deep mutable copy of the entities dictionary.
  2. Loop through the entire range of the string, unichar by unichar, doing this:
    1. Check if the unichar is in the surrogate pair range (0xd800 -> 0xdfff).
    2. If it is a surrogate pair codepoint, then go through all the entries in the entities dictionary and shift the indices by 1 if they are greater than the current location in the string (in terms of unichars). Then increment the loop counter by 1 to skip the partner of this surrogate pair as it's been handled now.
    3. If it's not a surrogate pair, do nothing.
  3. Loop through all entities and check that none of them overrun the end of the string. They shouldn't, but just in case. I found some cases where Twitter returned duff data.
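
The steps above can be sketched roughly as follows. This is not the answerer's code (which was not shared). It is a minimal sketch that assumes the API counts indices in Unicode code points, and it converts a single range rather than a whole entities dictionary. The function name UTF16RangeFromCodePointRange is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation or TTTAttributedLabel).
// Assumes codePointRange is counted in Unicode code points and converts it
// to an NSRange counted in UTF-16 code units, which is what NSString uses.
// Returns {NSNotFound, 0} if the converted range overruns the string.
static NSRange UTF16RangeFromCodePointRange(NSString *text, NSRange codePointRange) {
    NSUInteger start = codePointRange.location;
    NSUInteger end = codePointRange.location + codePointRange.length;
    NSUInteger length = text.length;

    for (NSUInteger i = 0; i < length; i++) {
        unichar c = [text characterAtIndex:i];
        BOOL isHigh = (c >= 0xD800 && c <= 0xDBFF);
        if (isHigh && i + 1 < length) {
            unichar next = [text characterAtIndex:i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                // One code point stored as two unichars: shift every index
                // that lies beyond this position by one.
                if (start > i) start++;
                if (end > i) end++;
                i++; // skip the low surrogate, it has been handled
            }
        }
    }

    // Guard against bad data from the API.
    if (end > length) {
        return NSMakeRange(NSNotFound, 0);
    }
    return NSMakeRange(start, end - start);
}
```

With the question's string and the API-supplied range (13, 8), this sketch yields (15, 8), which covers #hashtag.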

I hope that helps! I also hope that one day I can open source this code as I think it would be incredibly useful!

mattjgalloway answered Sep 22 '22 10:09


The problem is that any Unicode character in your string with a code point of U+10000 or higher will appear as two characters (a surrogate pair of UTF-16 unichars) in NSString.
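
To make that concrete, here is a small sketch comparing NSString's length with the number of code points. CodePointCount is a hypothetical helper written for illustration, not a Foundation API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts Unicode code points by treating each
// well-formed surrogate pair (high unichar followed by low unichar) as one.
static NSUInteger CodePointCount(NSString *text) {
    NSUInteger count = 0;
    NSUInteger length = text.length; // length is in UTF-16 code units
    for (NSUInteger i = 0; i < length; i++) {
        unichar c = [text characterAtIndex:i];
        BOOL isHigh = (c >= 0xD800 && c <= 0xDBFF);
        if (isHigh && i + 1 < length) {
            unichar next = [text characterAtIndex:i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                i++; // the pair counts as a single code point
            }
        }
        count++;
    }
    return count;
}
```

For a single 🍺 (U+1F37A), length is 2 while this helper returns 1, which is why indices counted in code points drift by one for every such character that precedes them.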

Since you want to format the hashtag, you should obtain the start and length values dynamically instead of hard-coding them. Use NSString's rangeOfString: to find the location of the # character, then use that result and the string's length to compute the needed length.

NSString *text = @"@user1234 🍺🍺 #hashtag"; // text.length reported as 22 by NSLog as each emoji is 2 chars in length
cell.textLabel.text = text;

NSUInteger start = [text rangeOfString:@"#"].location; // rangeOfString: returns an NSRange
if (start != NSNotFound) {
    NSUInteger length = text.length - start; // assumes the hashtag runs to the end of the string
    NSRange range = NSMakeRange(start, length);
    // apply formatting to TTTAttributedLabel
    [cell.textLabel addLinkToURL:[NSURL URLWithString:[NSString stringWithFormat:@"someaction://hashtag/%@", [cell.textLabel.text substringWithRange:range]]] withRange:range];
}
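
If the hashtag is not always the last thing in the string, one possible variation on the same idea is to let NSRegularExpression locate the hashtags, since its match ranges are already in NSString's own UTF-16 units. This is only a sketch: the helper name HashtagRanges and the pattern #\w+ are assumptions made for illustration, and real hashtag rules are more involved than \w+.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the NSRange of every "#word" in text,
// each wrapped in an NSValue. The ranges are in UTF-16 units, so they can
// be passed straight to substringWithRange: even when emoji precede them.
static NSArray *HashtagRanges(NSString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"#\\w+"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return @[]; // the pattern failed to compile
    }

    NSMutableArray *ranges = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:text
                                      options:0
                                        range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        [ranges addObject:[NSValue valueWithRange:match.range]];
    }
    return ranges;
}
```

Each returned value can be unwrapped with rangeValue and passed to addLinkToURL:withRange: in the same way as the range above.
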
rmaddy answered Sep 22 '22 10:09