
How to separate emojis entered (through default keyboard) on textfield

Tags:

ios

swift

emoji

I entered two emojis in a text field, 👨‍👨‍👧‍👧😍, and I get a total length of 5 characters: 4 characters for the first emoji and 1 for the second. It looks like Apple has combined 4 emojis to form one.

I'm looking for Swift code that separates the emojis. Taking the example above, I should get 2 separate strings/characters, one for each emoji.

Can anyone help me solve this? I've tried many things, such as regex separation, componentsSeparatedByString, and character sets, but none of them worked.

Thanks in advance.

Kiran Jasvanee asked Dec 30 '15 06:12


2 Answers

Update for Swift 4 (Xcode 9)

As of Swift 4 (tested with Xcode 9 beta), an "Emoji ZWJ Sequence" is treated as a single Character, as mandated by the Unicode 9 standard:

let str = "👨‍👨‍👧‍👧😍"
print(str.count) // 2
print(Array(str)) // ["👨‍👨‍👧‍👧", "😍"]

Also String is a collection of its characters (again), so we can call str.count to get the length, and Array(str) to get all characters as an array.
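
To get one String per emoji, as the question asks, each Character can be mapped back to a String. This is a minimal sketch assuming Swift 4 or later; the names input and emojis are only illustrative, and the string is written with Unicode escapes so the zero-width joiners are visible in the source:

```swift
// Assumes Swift 4 or later, where a String is a collection of grapheme clusters.
// Family emoji (four people joined by U+200D) followed by the heart-eyes face.
let input = "\u{1F468}\u{200D}\u{1F468}\u{200D}\u{1F467}\u{200D}\u{1F467}\u{1F60D}"

// One String per user-perceived character.
let emojis: [String] = input.map { String($0) }

print(emojis.count) // 2
for emoji in emojis {
    // Print each emoji together with the number of Unicode scalars it is built from.
    print(emoji, emoji.unicodeScalars.count)
}
```

The first element still contains all seven scalars (four people and three joiners); only the way the string is split into characters has changed.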


(Old answer for Swift 3 and earlier)

This is only a partial answer which may help in this particular case.

"👨‍👨‍👧‍👧" is indeed a combination of four separate characters:

let str = "👨‍👨‍👧‍👧😍"
print(Array(str.characters))

// Output: ["👨‍", "👨‍", "👧‍", "👧", "😍"]

which are glued together with U+200D (ZERO WIDTH JOINER):

for c in str.unicodeScalars {
    print(String(c.value, radix: 16))
}

/* Output:
1f468
200d
1f468
200d
1f467
200d
1f467
1f60d
*/
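
If the individual family members are wanted rather than the combined emoji, the joiners can be skipped while walking the scalars. This is only a sketch for this particular sequence, not a general emoji parser; the names family, zeroWidthJoiner and members are illustrative:

```swift
// The family emoji written with explicit escapes: four people joined by U+200D.
let family = "\u{1F468}\u{200D}\u{1F468}\u{200D}\u{1F467}\u{200D}\u{1F467}"
let zeroWidthJoiner: Unicode.Scalar = "\u{200D}"

// Collect every scalar except the joiners as its own String.
var members: [String] = []
for scalar in family.unicodeScalars where scalar != zeroWidthJoiner {
    members.append(String(scalar))
}

print(members.count) // 4
```

This works here because every component is a single scalar. Sequences that also carry skin-tone modifiers or variation selectors would need those scalars kept together with their base.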

Enumerating the string with the .ByComposedCharacterSequences options combines these characters correctly:

var chars : [String] = []
str.enumerateSubstringsInRange(str.characters.indices, options: .ByComposedCharacterSequences) {
    (substring, _, _, _) -> () in
    chars.append(substring!)
}
print(chars)

// Output: ["👨‍👨‍👧‍👧", "😍"]
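
The snippet above uses Swift 2 spelling. In Swift 3 and later the same Foundation call is named enumerateSubstrings(in:options:_:). The following is a sketch of that form, assuming Foundation is available; the name pieces is illustrative, and exactly where the boundaries fall depends on the Foundation version in use, so no output is shown:

```swift
import Foundation

let str = "\u{1F468}\u{200D}\u{1F468}\u{200D}\u{1F467}\u{200D}\u{1F467}\u{1F60D}"
var pieces: [String] = []

// Walk the whole string by composed character sequences.
str.enumerateSubstrings(in: str.startIndex..<str.endIndex,
                        options: .byComposedCharacterSequences) { substring, _, _, _ in
    // The substring is optional; skip the callback if it is missing.
    if let substring = substring {
        pieces.append(substring)
    }
}

print(pieces)
```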

But there are other cases where this does not work, e.g. the "flags", which are a sequence of "Regional Indicator characters" (compare "Swift countElements() return incorrect value when count flag emoji"). With

let str = "🇩🇪"

the result of the above loop is

["🇩", "🇪"]

which is not the desired result.

The full rules are defined in section 3, "Grapheme Cluster Boundaries", of Unicode Standard Annex #29, "Unicode Text Segmentation".
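
For comparison, Swift 4 and later apply the regional-indicator pairing rule when splitting a String into Characters, so each flag stays in one piece. A minimal sketch, assuming Swift 4 or later; the names flags and separated are illustrative:

```swift
// Two flags written as regional indicator pairs: D+E (Germany) and F+R (France).
let flags = "\u{1F1E9}\u{1F1EA}\u{1F1EB}\u{1F1F7}"

// Each Character holds one pair of regional indicators.
let separated = flags.map { String($0) }

print(separated.count)                    // 2
print(separated[0].unicodeScalars.count)  // 2
```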

Martin R answered Sep 23 '22 15:09


You can use this code example or this pod.

To use it in Swift, import the category into the YourProject_Bridging_Header

#import "NSString+EMOEmoji.h"

Then you can check the range for every emoji in your String:

let example: NSString = "👨‍👨‍👧‍👧😍" // your string

let ranges: NSArray = example.emo_emojiRanges() // ranges of the emojis

for value in ranges {
    let range: NSRange = (value as! NSValue).rangeValue
    print(example.substringWithRange(range))
}

// Output:
// 👨‍👨‍👧‍👧
// 😍
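
If adding an Objective-C category is not an option, a rough pure-Swift alternative is to filter Characters using the standard library's Unicode scalar properties. This is a different technique from the pod, not a reimplementation of it, and it assumes Swift 5 or later. The helper looksLikeEmoji is hypothetical and only a heuristic:

```swift
// Heuristic, not a full emoji parser. Assumes Swift 5 or later.
// A Character counts as an emoji if its first scalar is shown as an emoji by default,
// or if it is emoji-capable and combined with further scalars (e.g. keycap sequences).
func looksLikeEmoji(_ character: Character) -> Bool {
    guard let first = character.unicodeScalars.first else { return false }
    return first.properties.isEmojiPresentation
        || (first.properties.isEmoji && character.unicodeScalars.count > 1)
}

// Plain text mixed with the family emoji and the heart-eyes face.
let message = "Hi \u{1F468}\u{200D}\u{1F468}\u{200D}\u{1F467}\u{200D}\u{1F467}\u{1F60D}! 42"
let found = message.filter(looksLikeEmoji).map { String($0) }

print(found.count) // 2
```

The second condition is there because plain digits also report isEmoji as true; on their own they are single scalars without emoji presentation, so they are skipped.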

I created a small example project with the code above.

For further reading, see this interesting article from Instagram.

Gabriel.Massana answered Sep 23 '22 15:09