
JavaScript regular expression for Unicode emoji

I want to replace every emoji in a string with an icon. I have already replaced text emoticons such as :) :D :P :3 <3 XP with icons, so if the user writes :) in a string, it is replaced with an icon.

But I have a problem: what if the user directly pastes the Unicode character 😊, which is equivalent to :)?

What I need: how can I convert a Unicode emoji into a form usable in a JavaScript regular expression, something like \ud800-\udbff? I have many emoji, so I need a general way to convert them, after which I want to match them with regular expressions.

Example: 😁wew😁
I want to turn those emoji into something like \uD83D\uDE01|\uD83D\uDE4F|. I don't know how to do this conversion, so I need to know how to convert any emoji to those escape sequences.

Mohamed Mohamed asked Apr 05 '17 22:04


People also ask

Does JavaScript regex support Unicode?

Yes, since ES2018. With the u flag, JavaScript regexes support Unicode property escapes such as \p{Emoji_Presentation}. Older engines have no support for Unicode character classes.

What is the regex for Unicode?

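To match a specific Unicode code point, use \uFFFF where FFFF is the hexadecimal number of the code point you want to match. In this form you must always specify 4 hexadecimal digits, e.g. \u00E0 matches à, but only when encoded as a single code point U+00E0. With the u flag, JavaScript also accepts \u{...} with one to six hexadecimal digits, which is needed for code points above U+FFFF such as most emoji.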

Does regex work with emojis?

emoji-regex offers a regular expression to match all emoji symbols and sequences (including textual representations of emoji) as per the Unicode Standard. It's based on emoji-test-regex-pattern, which generates (at build time) the regular expression pattern based on the Unicode Standard.


2 Answers

In ECMAScript 6 you can detect emoji in a fairly simple way. I have compiled a regex covering several Unicode blocks, namely:

  • Miscellaneous Symbols and Pictographs
  • Supplemental Symbols and Pictographs
  • Emoticons
  • Transport and Map Symbols
  • Miscellaneous Symbols
  • Dingbats
  • Regional indicator symbol

Regex:

/[\u{1f300}-\u{1f5ff}\u{1f900}-\u{1f9ff}\u{1f600}-\u{1f64f}\u{1f680}-\u{1f6ff}\u{2600}-\u{26ff}\u{2700}-\u{27bf}\u{1f1e6}-\u{1f1ff}\u{1f191}-\u{1f251}\u{1f004}\u{1f0cf}\u{1f170}-\u{1f171}\u{1f17e}-\u{1f17f}\u{1f18e}\u{3030}\u{2b50}\u{2b55}\u{2934}-\u{2935}\u{2b05}-\u{2b07}\u{2b1b}-\u{2b1c}\u{3297}\u{3299}\u{303d}\u{00a9}\u{00ae}\u{2122}\u{23f3}\u{24c2}\u{23e9}-\u{23ef}\u{25b6}\u{23f8}-\u{23fa}]/ug
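As a rough sketch of how such a regex could be used for the replacement the question describes: the snippet below uses only a subset of the ranges above to keep the line short, and `replaceEmoji` and `iconFor` are hypothetical names introduced for illustration, not part of any library.

```javascript
// Subset of the ranges from the regex above (assumption: shortened for readability).
// The `u` flag is required so that \u{...} escapes and astral code points work.
const emojiPattern =
  /[\u{1f300}-\u{1f5ff}\u{1f900}-\u{1f9ff}\u{1f600}-\u{1f64f}\u{1f680}-\u{1f6ff}\u{2600}-\u{26ff}\u{2700}-\u{27bf}]/ug;

// Hypothetical helper: replaces every matched emoji with whatever `toIcon` returns.
function replaceEmoji(text, toIcon) {
  return text.replace(emojiPattern, toIcon);
}

// Hypothetical icon lookup: here just a placeholder built from the code point in hex.
const iconFor = (emoji) => `[icon:${emoji.codePointAt(0).toString(16)}]`;

console.log(replaceEmoji('😁wew😁', iconFor)); // [icon:1f601]wew[icon:1f601]
```

In a real page, `iconFor` would presumably return an image tag or CSS class for the icon set in use.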


This answer doesn't directly answer the question, but it gives some insight into how to handle emoji using Unicode blocks and ES6.
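For the conversion the question actually asks about, one possible approach is to walk the string's UTF-16 code units and print each as a \uXXXX escape. This is a minimal sketch, not a definitive implementation; `toUnicodeEscapes` and `buildEmojiRegex` are hypothetical helper names.

```javascript
// Hypothetical helper: converts a string to \uXXXX escapes, one per UTF-16 code unit.
// An emoji outside the BMP therefore becomes a surrogate pair (two escapes).
function toUnicodeEscapes(str) {
  let out = '';
  for (let i = 0; i < str.length; i++) {
    const hex = str.charCodeAt(i).toString(16).toUpperCase().padStart(4, '0');
    out += '\\u' + hex;
  }
  return out;
}

// Hypothetical helper: joins the escaped emoji into an alternation.
// No `u` flag is used, so each surrogate pair is matched as two code units.
function buildEmojiRegex(emojiList) {
  return new RegExp(emojiList.map(toUnicodeEscapes).join('|'), 'g');
}

console.log(toUnicodeEscapes('😁')); // \uD83D\uDE01
console.log(toUnicodeEscapes('🙏')); // \uD83D\uDE4F

const re = buildEmojiRegex(['😁', '🙏']);
console.log('😁wew😁'.replace(re, ':)')); // :)wew:)
```

Escaping every code unit also sidesteps regex metacharacters, since each character reaches the pattern only as a \uXXXX escape.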

Suhail Gupta answered Sep 26 '22 22:09


Use Unicode property escapes (available since ES2018, and only with the u flag) like this:

/\p{Emoji_Presentation}/ug
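A short sketch of how this pattern might be applied, assuming an engine with ES2018 property-escape support. The variable names are illustrative only.

```javascript
// Matches code points whose default rendering is the emoji (colorful) form.
const emojiPresentation = /\p{Emoji_Presentation}/ug;

const input = '😁wew😁';
console.log(input.match(emojiPresentation));         // [ '😁', '😁' ]
console.log(input.replace(emojiPresentation, ':)')); // :)wew:)

// Text-presentation symbols such as U+2764 (heavy black heart) are not matched
// by Emoji_Presentation, while the broader \p{Emoji} also matches ASCII digits.
const isEmojiPresentation = (ch) => /\p{Emoji_Presentation}/u.test(ch);
const isEmoji = (ch) => /\p{Emoji}/u.test(ch);
console.log(isEmojiPresentation('\u2764')); // false
console.log(isEmoji('1'));                  // true
```

Which property fits best depends on whether symbols with a default text presentation should also be swapped for icons.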
RonyHe answered Sep 24 '22 22:09