I have a UITextView and I need to detect if a user enters an emoji character.
I would think that just checking the Unicode value of the newest character would suffice, but with the new Emoji 2 set, some characters are scattered all throughout the Unicode index (e.g. Apple's newly designed copyright and registered-trademark glyphs).
Perhaps something to do with checking the language of the character with NSLocale or LocalizedString values?
Does anyone know a good solution?
Thanks!
Because emoji characters are treated as pictographs, they are encoded in Unicode based primarily on their general appearance, not on an intended semantic. The meaning of each emoji can vary depending on language, culture, context, and may change or be repurposed by various groups over time.
Communication using emoji is called pictographic communication, which belongs to a broader category of written language known as logographic writing. A single emoji is linguistically referred to as a pictogram: a symbol that conveys its meaning through its resemblance to a physical object.
The following are cleaner and more efficient implementations of the code that checks whether the drawn character contains any color. The idea is that emoji are rendered as color glyphs, so if you force the text color to black and still find a non-black pixel, the character must be an emoji.
These have been written as category/extension methods to make them easier to use.
Objective-C:
NSString+Emoji.h:
#import <Foundation/Foundation.h>
@interface NSString (Emoji)
- (BOOL)hasColor;
@end
NSString+Emoji.m:
#import "NSString+Emoji.h"
#import <UIKit/UIKit.h>
@implementation NSString (Emoji)
- (BOOL)hasColor {
    // Render the string in black on a black background; any remaining non-black
    // pixel must come from a color glyph, i.e. an emoji.
    UILabel *characterRender = [[UILabel alloc] initWithFrame:CGRectZero];
    characterRender.text = self;
    characterRender.textColor = UIColor.blackColor;
    characterRender.backgroundColor = UIColor.blackColor; // needed to remove subpixel rendering colors
    [characterRender sizeToFit];

    CGRect rect = characterRender.bounds;
    UIGraphicsBeginImageContextWithOptions(rect.size, YES, 1);
    CGContextRef contextSnap = UIGraphicsGetCurrentContext();
    [characterRender.layer renderInContext:contextSnap];
    UIImage *capturedImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();

    // Copy the rendered pixels into an RGBA byte buffer.
    CGImageRef imageRef = capturedImage.CGImage;
    size_t width = CGImageGetWidth(imageRef);
    size_t height = CGImageGetHeight(imageRef);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    size_t bytesPerPixel = 4;
    size_t bitsPerComponent = 8;
    size_t bytesPerRow = bytesPerPixel * width;
    size_t size = height * width * bytesPerPixel;
    unsigned char *rawData = (unsigned char *)calloc(size, sizeof(unsigned char));
    CGContextRef context = CGBitmapContextCreate(rawData, width, height,
                                                 bitsPerComponent, bytesPerRow, colorSpace,
                                                 kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGColorSpaceRelease(colorSpace);
    CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
    CGContextRelease(context);

    // Scan for any pixel whose RGB components are not all zero.
    BOOL result = NO;
    for (size_t offset = 0; offset < size; offset += bytesPerPixel) {
        unsigned char r = rawData[offset];
        unsigned char g = rawData[offset + 1];
        unsigned char b = rawData[offset + 2];
        if (r || g || b) {
            result = YES;
            break;
        }
    }
    free(rawData);
    return result;
}
@end
Example usage:
if ([@"😎" hasColor]) {
// Yes, it does
}
if ([@"@" hasColor]) {
} else {
// No, it does not
}
Swift:
String+Emoji.swift:
import UIKit
extension String {
    func hasColor() -> Bool {
        // Render the string in black on a black background; any remaining non-black
        // pixel must come from a color glyph, i.e. an emoji.
        let characterRender = UILabel(frame: .zero)
        characterRender.text = self
        characterRender.textColor = .black
        characterRender.backgroundColor = .black // needed to remove subpixel rendering colors
        characterRender.sizeToFit()

        let rect = characterRender.bounds
        UIGraphicsBeginImageContextWithOptions(rect.size, true, 1)
        let contextSnap = UIGraphicsGetCurrentContext()!
        characterRender.layer.render(in: contextSnap)
        let capturedImageTmp = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()
        guard let capturedImage = capturedImageTmp else { return false }

        // Copy the rendered pixels into an RGBA byte buffer.
        let imageRef = capturedImage.cgImage!
        let width = imageRef.width
        let height = imageRef.height
        let colorSpace = CGColorSpaceCreateDeviceRGB()
        let bytesPerPixel = 4
        let bytesPerRow = bytesPerPixel * width
        let bitsPerComponent = 8
        let size = width * height * bytesPerPixel
        let rawData = calloc(size, MemoryLayout<CUnsignedChar>.stride).assumingMemoryBound(to: CUnsignedChar.self)
        guard let context = CGContext(data: rawData, width: width, height: height,
                                      bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow,
                                      space: colorSpace,
                                      bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Big.rawValue)
        else {
            free(rawData) // don't leak the buffer if the context can't be created
            return false
        }
        context.draw(imageRef, in: CGRect(x: 0, y: 0, width: width, height: height))

        // Scan for any pixel whose RGB components are not all zero.
        var result = false
        for offset in stride(from: 0, to: size, by: bytesPerPixel) {
            let r = rawData[offset]
            let g = rawData[offset + 1]
            let b = rawData[offset + 2]
            if r > 0 || g > 0 || b > 0 {
                result = true
                break
            }
        }
        free(rawData)
        return result
    }
}
Example usage:
if "😎".hasColor() {
// Yes, it does
}
if "@".hasColor() {
} else {
// No, it does not
}
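If you need to scan a whole string rather than a single character, a minimal wrapper over the hasColor() extension above might look like this (my own sketch, not part of the original code; it assumes String+Emoji.swift is in the same target):

extension String {
    // Walk the string character by character, so composed emoji such as flags
    // and ZWJ sequences are rendered (and tested) as a single glyph.
    func containsColoredCharacter() -> Bool {
        return self.contains { String($0).hasColor() }
    }
}

// "plain text 😎".containsColoredCharacter() // true
// "plain text".containsColoredCharacter()    // false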
First let's address your "55357 method" – and why it works for many emoji characters.
In Cocoa, an NSString is a collection of unichars, and unichar is just a typealias for unsigned short, which is the same as UInt16. Since the maximum value of UInt16 is 0xffff, this rules out quite a few emoji from being able to fit into one unichar, as only two of the six main Unicode blocks used for emoji fall under this range: Miscellaneous Symbols (U+2600–U+26FF) and Dingbats (U+2700–U+27BF).
These blocks contain 113 emoji, and an additional 66 emoji that can be represented as a single unichar can be found spread around various other blocks. However, these 179 characters represent only a fraction of the 1126 emoji base characters; the rest must be represented by more than one unichar.
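To see this in code, here is a small Swift snippet of my own (not part of the original answer) showing that a BMP emoji fits into a single UTF-16 code unit while one from the Emoticons block needs two:

import Foundation

let sun = "\u{2600}"   // U+2600 BLACK SUN WITH RAYS, Miscellaneous Symbols (BMP)
let sunglasses = "😎"  // U+1F60E, Emoticons block (outside the BMP)

print(sun.utf16.map { String(format: "0x%04X", $0) })        // ["0x2600"]
print(sunglasses.utf16.map { String(format: "0x%04X", $0) }) // ["0xD83D", "0xDE0E"]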
Let's analyse your code:
unichar unicodevalue = [text characterAtIndex:0];
What's happening is that you're simply taking the first unichar of the string, and while this works for the previously mentioned 179 characters, it breaks apart when you encounter a UTF-32 character, since NSString converts everything into UTF-16 encoding. The conversion works by substituting the UTF-32 value with a surrogate pair, which means that the NSString now contains two unichars.
And now we're getting to why the number 55357, or 0xd83d, appears for so many emoji: when you look at only the first UTF-16 code unit of a UTF-32 character, you get the high surrogate, each of which spans 1024 low surrogates. The range covered by the high surrogate 0xd83d is U+1F400–U+1F7FF, which starts in the middle of the largest emoji block, Miscellaneous Symbols and Pictographs (U+1F300–U+1F5FF), and continues all the way up into Geometric Shapes Extended (U+1F780–U+1F7FF) – a range containing a total of 563 emoji and 333 non-emoji characters.
So an impressive 50% of emoji base characters have the high surrogate 0xd83d, but these deduction methods still leave 384 emoji characters unhandled, along with giving false positives for at least as many.
I recently answered a somewhat related question with a Swift implementation, and if you want to, you can look at how emoji are detected in this framework, which I created for the purpose of replacing standard emoji with custom images.
Anyhow, what you can do is extract the UTF-32 code point from the characters, which we'll do according to the specification:
- (BOOL)textView:(UITextView *)textView shouldChangeTextInRange:(NSRange)range replacementText:(NSString *)text {
    // Get the UTF-16 representation of the text.
    unsigned long length = text.length;
    unichar buffer[length];
    [text getCharacters:buffer];

    // Initialize array to hold our UTF-32 values.
    NSMutableArray *array = [[NSMutableArray alloc] init];

    // Temporary stores for the UTF-32 and UTF-16 values.
    UTF32Char utf32 = 0;
    UTF16Char h16 = 0, l16 = 0;

    for (int i = 0; i < length; i++) {
        unichar surrogate = buffer[i];

        // High surrogate (U+D800–U+DBFF).
        if (0xd800 <= surrogate && surrogate <= 0xdbff) {
            h16 = surrogate;
            continue;
        }
        // Low surrogate (U+DC00–U+DFFF).
        else if (0xdc00 <= surrogate && surrogate <= 0xdfff) {
            l16 = surrogate;
            // Convert surrogate pair to UTF-32 encoding.
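            // For example, "😎" (U+1F60E) arrives as h16 = 0xD83D, l16 = 0xDE0E:
            // ((0xD83D - 0xD800) << 10) + (0xDE0E - 0xDC00) + 0x10000
            //   = 0xF400 + 0x020E + 0x10000 = 0x1F60E.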
            utf32 = ((h16 - 0xd800) << 10) + (l16 - 0xdc00) + 0x10000;
        }
        // Normal UTF-16.
        else {
            utf32 = surrogate;
        }

        // Add UTF-32 value to array.
        [array addObject:[NSNumber numberWithUnsignedInteger:utf32]];
    }

    NSLog(@"%@ contains values:", text);

    for (int i = 0; i < array.count; i++) {
        UTF32Char character = (UTF32Char)[[array objectAtIndex:i] unsignedIntegerValue];
        NSLog(@"\t- U+%x", character);
    }

    return YES;
}
Typing "😎" into the UITextView
writes this to console:
😎 contains values:
- U+1f60e
With that logic, just compare the value of character to your data source of emoji code points, and you'll know exactly if the character is an emoji or not.
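A minimal sketch of that comparison (my own illustration, not the answer's code; the ranges below are only a sample, and a real data source would need to cover every emoji block):

let sampleEmojiRanges: [ClosedRange<UInt32>] = [
    0x1F300...0x1F5FF, // Miscellaneous Symbols and Pictographs
    0x1F600...0x1F64F, // Emoticons
    0x1F680...0x1F6FF, // Transport and Map Symbols
    0x1F900...0x1F9FF, // Supplemental Symbols and Pictographs
]

func isEmojiCodePoint(_ value: UInt32) -> Bool {
    return sampleEmojiRanges.contains { $0.contains(value) }
}

isEmojiCodePoint(0x1F60E) // true  ("😎")
isEmojiCodePoint(0x0040)  // false ("@")

On recent Swift versions you can also avoid maintaining your own table by querying Unicode.Scalar.Properties (for example isEmoji and isEmojiPresentation) on each scalar.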
P.S. There are a few "invisible" characters, namely variation selectors and zero-width joiners, that should also be handled, so I recommend studying those to learn how they behave.
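For instance, this small snippet of my own shows what those invisible characters look like at the scalar level:

import Foundation

// A family emoji is several emoji joined with U+200D ZERO WIDTH JOINER, and the
// red heart is U+2764 followed by the variation selector U+FE0F.
let family = "👨‍👩‍👧"
print(family.count) // 1 – a single user-perceived character
print(family.unicodeScalars.map { String(format: "U+%04X", $0.value) })
// ["U+1F468", "U+200D", "U+1F469", "U+200D", "U+1F467"]

let heart = "❤️"
print(heart.unicodeScalars.map { String(format: "U+%04X", $0.value) })
// ["U+2764", "U+FE0F"]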