I have the string @"Hi there! \U0001F603"
, which correctly shows the emoji like Hi there! 😃
if I put it in a UILabel
.
But I want to create it dynamically like [NSString stringWithFormat:@"Hi there! \U0001F60%ld", (long)arc4random_uniform(10)]
, but it doesn't even compile.
If I double the backslash, it shows the Unicode value literally like Hi there! \U0001F605
.
How can I achieve this?
A step back, for a second: that number that you have, 1F603₁₆, is a Unicode code point, which, to try to put it as simply as possible, is the index of this emoji in the list of all Unicode items. That's not the same thing as the bytes that the computer actually handles, which are the "encoded value" (technically, the code units).
When you write the literal @"\U0001F603" in your code, the compiler does the encoding for you, writing the necessary bytes.* If you don't have the literal at compile time, you must do the encoding yourself. That is, you must transform the code point into a set of bytes that represent it. For example, in the UTF-16 encoding that NSString uses internally, your code point is represented by the bytes ff fe 3d d8 03 de.
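To make the code point/code unit distinction concrete, here is a minimal sketch (assuming nothing beyond Foundation) that asks NSString for the UTF-16 code units behind that literal; the single code point U+1F603 comes back as a surrogate pair:
NSString *smiley = @"\U0001F603";
// One code point, but two UTF-16 code units, so length is 2.
NSLog(@"length = %lu", (unsigned long)smiley.length);
// Prints the surrogate pair D83D DE03.
NSLog(@"units = %04X %04X",
      (unsigned)[smiley characterAtIndex:0],
      (unsigned)[smiley characterAtIndex:1]);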
You can't, at run time, modify that literal and end up with the correct bytes, because the compiler has already done its work and gone to bed.
(You can read in depth about this stuff and how it pertains to NSString in an article by Ole Begemann at objc.io.)
Fortunately, one of the available encodings, UTF-32, represents code points directly: the value of the bytes is the same as the code point's. In other words, if you assign your code point number to a 32-bit unsigned integer, you've got proper UTF-32-encoded data.
That leads us to the process you need:
// Encoded start point
uint32_t base_point_UTF32 = 0x1F600;
// Generate random point
uint32_t offset = arc4random_uniform(10);
uint32_t new_point = base_point_UTF32 + offset;
// Read the four bytes into NSString, interpreted as UTF-32LE.
// Intel machines and iOS on ARM are little endian; others byte swap/change
// encoding as necessary.
NSString *emoji = [[NSString alloc] initWithBytes:&new_point
                                           length:4
                                         encoding:NSUTF32LittleEndianStringEncoding];
(N.B. that this may not work as expected for an arbitrary code point; not all code points are valid.)
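Because of that, a defensive sketch may be worthwhile (assuming -initWithBytes:length:encoding: returns nil for bytes it can't interpret, and with an arbitrary fallback emoji chosen here just for illustration):
NSString *emoji = [[NSString alloc] initWithBytes:&new_point
                                           length:4
                                         encoding:NSUTF32LittleEndianStringEncoding];
if (emoji == nil) {
    // new_point wasn't a valid code point (e.g. a lone surrogate); fall back.
    emoji = @"\U0001F603";
}
NSString *message = [NSString stringWithFormat:@"Hi there! %@", emoji];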
*Note, it does the same thing for "normal" strings like @"b", as well.
\U0001F603 is a literal which is evaluated at compile time. You want a solution which can be executed at runtime.
So you want to have a string with a dynamic Unicode character. %C is the format specifier for a Unicode character (unichar).
[NSString stringWithFormat:@"Hi there! %C", (unichar)(0x01F600 + arc4random_uniform(10))];
unichar is too small for emoji. Thanks @JoshCaswell for correcting me.
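For what it's worth, %C does work for characters that fit into a single UTF-16 code unit (code points up to U+FFFF), so a sketch like this is fine as long as you stay in the Basic Multilingual Plane:
unichar snowman = 0x2603; // U+2603 SNOWMAN fits in 16 bits
NSString *winter = [NSString stringWithFormat:@"Brr! %C", snowman]; // "Brr! ☃"
Emoji code points live above U+FFFF, which is why this approach falls short for them.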
Update: a working answer
@JoshCaswell has the correct answer with -initWithBytes:length:encoding:, but I think I can write a better wrapper.
Here is my answer
NSString *MyStringFromUnicodeCharacter(uint32_t character) {
uint32_t bytes = htonl(character); // Convert the character to a known ordering
return [[NSString alloc] initWithBytes:&bytes length:sizeof(uint32_t) encoding:NSUTF32StringEncoding];
}
So, in use…
NSString *emoji = MyStringFromUnicodeCharacter(0x01F600 + arc4random_uniform(10));
NSString *message = [NSString stringWithFormat:@"Hi there! %@", emoji];
Update 2
Finally, put it in a category to make it real Objective-C.
@interface NSString (MyString)
+ (instancetype)stringWithUnicodeCharacter:(uint32_t)character;
@end
@implementation NSString (MyString)
+ (instancetype)stringWithUnicodeCharacter:(uint32_t)character {
uint32_t bytes = htonl(character); // Convert the character to a known ordering
return [[NSString alloc] initWithBytes:&bytes length:sizeof(uint32_t) encoding:NSUTF32StringEncoding];
}
@end
And again, in use…
NSString *emoji = [NSString stringWithUnicodeCharacter:0x01F600 + arc4random_uniform(10)];
NSString *message = [NSString stringWithFormat:@"Hi there! %@", emoji];