Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to detect emoji using javascript

People also ask

How do you check if a string has emojis?

This range of Unicode will match almost all the emoji's in a string. // Regular expression to match emoji const regexExp = /(\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])/gi; Now let's write a string with some emojis.

Is regex an emoji?

emoji-regex offers a regular expression to match all emoji symbols and sequences (including textual representations of emoji) as per the Unicode Standard. It's based on emoji-test-regex-pattern, which generates (at build time) the regular expression pattern based on the Unicode Standard.


The answers might work but are terrible because they rely on unicode ranges that are unreadable and somewhat "magic" because it's not always clear where do they come from and why they work, not to mention they're not resilient to new emojis being added to the spec.

Major browsers now support unicode property escape which allows for matching emojis based on their belonging in the Emoji unicode category: \p{Emoji} matches an emoji, \P{Emoji} matches a non-emoji.

Note that officially, 0123456789#* and other characters are emojis too, so the property escape you might want to use is not Emoji but rather Extended_Pictographic which denotes all the characters typically understood as emojis!

Make sure to include the u flag at the end.

console.log(
  /\p{Emoji}/u.test('flowers'), // false :)
  /\p{Emoji}/u.test('flowers 🌼🌺🌸'), // true :)
  /\p{Emoji}/u.test('flowers 123'), // true :( 
)
console.log(
  /\p{Extended_Pictographic}/u.test('flowers'), // false :)
  /\p{Extended_Pictographic}/u.test('flowers 🌼🌺🌸'), // true :)
  /\p{Extended_Pictographic}/u.test('flowers 123'), // false :)
)

This works fine for detecting emojis, but if you want to use the same regex to extract them, you might be surprised with its behavior, since some emojis that appear as one character are actually several characters. They're what we call emoji sequences, more about them in this question

const regex = /\p{Extended_Pictographic}/ug
const family = '👨‍👩‍👧' // "family 
console.log(family.length) // not 1, but 8!
console.log(regex.test(family)) // true, as expected
console.log(family.match(regex)) // not [family], but [man, woman, girl]

You can use the following regex:

/(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])/g

If you just want to remove it from the string, you can do something like this.

function removeEmojis (string) {
  var regex = /(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])/g;

  return string.replace(regex, '');
}

You can use a regular expression from this lib emoji-regex


A simple function that returns true if your string contains one or more emojis.

function isEmoji(str) {
    var ranges = [
        '(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|[\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c[\ude32-\ude3a]|[\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])' // U+1F680 to U+1F6FF
    ];
    if (str.match(ranges.join('|'))) {
        return true;
    } else {
        return false;
    }
}

We can detect all list of surrogate pairs or the Emoji characters in a specific range. If the issue related with storing the input string to database like MySQL version before 5.5 we need to detect and remove all the surrogate pairs using the below regex

/([\uD800-\uDBFF][\uDC00-\uDFFF])/g.