
How to remove emoji code using javascript?

How do I remove emoji code using JavaScript? I thought I had taken care of it using the code below, but I still have characters like 🔴.

function removeInvalidChars() {
    return this.replace(/[\uE000-\uF8FF]/g, '');
}
manraj82 asked Jun 12 '12 08:06



2 Answers

For me, none of the answers completely removed all emoji, so I had to do some work myself, and this is what I got:

text.replace(/([\u2700-\u27BF]|[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2011-\u26FF]|\uD83E[\uDD10-\uDDFF])/g, ''); 

Also keep in mind that if the string is later inserted into a database, replacing matches with an empty string could expose a security issue. Instead, replace them with the replacement character U+FFFD; see: http://www.unicode.org/reports/tr36/#Deletion_of_Noncharacters
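As a minimal sketch of that suggestion, the same pattern can substitute U+FFFD for each match rather than deleting it. The names EMOJI_PATTERN and replaceEmoji are hypothetical and only for illustration; the character ranges are copied unchanged from the expression above.

```javascript
// Same ranges as the expression above, kept as-is.
const EMOJI_PATTERN = /([\u2700-\u27BF]|[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2011-\u26FF]|\uD83E[\uDD10-\uDDFF])/g;

// Hypothetical helper: puts U+FFFD where each match was, so the text on
// either side of a removed character is never silently joined together.
function replaceEmoji(text) {
  return text.replace(EMOJI_PATTERN, '\uFFFD');
}

// 'ad' + U+1F534 + 'min' keeps a visible marker between the two halves.
console.log(replaceEmoji('ad\uD83D\uDD34min') === 'ad\uFFFDmin'); // true
```

With plain deletion the same input would collapse to 'admin', which is the kind of accidental join the linked report warns about.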

jony89 answered Oct 11 '22 21:10


The range you have selected is the Private Use Area, containing non-standard characters. Carriers used to encode emoji as different, inconsistent values inside this range.

More recently, the emoji have been given standardised 'unified' codepoints. Many of these are outside of the Basic Multilingual Plane, in the block U+1F300–U+1F5FF, including your example 🔴 U+1F534 Large Red Circle.
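To see why the original expression misses this character, here is a small check (a sketch, assuming an ES2015-capable engine for the \u{...} string escape and codePointAt). U+1F534 is stored as two UTF-16 code units, and neither falls inside \uE000-\uF8FF.

```javascript
// U+1F534 LARGE RED CIRCLE, written with an ES2015 code point escape.
const circle = '\u{1F534}';

// One code point, but two UTF-16 code units (a surrogate pair).
console.log(circle.length);                      // 2
console.log(circle.charCodeAt(0).toString(16));  // 'd83d' (high surrogate)
console.log(circle.charCodeAt(1).toString(16));  // 'dd34' (low surrogate)
console.log(circle.codePointAt(0).toString(16)); // '1f534'

// The Private Use Area range from the question matches neither unit.
const matchesPrivateUse = /[\uE000-\uF8FF]/.test(circle);
console.log(matchesPrivateUse);                  // false
```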

You could detect these characters with [\U0001F300-\U0001F5FF] in a regex engine that supported non-BMP characters, but JavaScript's RegExp is not such a beast. Unfortunately the JS string model is based on UTF-16 code units, so you'd have to work with the UTF-16 surrogates in a regexp:

return this.replace(/([\uE000-\uF8FF]|\uD83C[\uDF00-\uDFFF]|\uD83D[\uDC00-\uDDFF])/g, '') 
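The expression above was written before ES2015 added the u flag to RegExp. In engines that support that flag, the same two ranges can be written with code point escapes instead of surrogates. This is only a sketch of an alternative spelling, not part of the original answer; EMOJI_BLOCKS and stripEmojiBlocks are hypothetical names.

```javascript
// With the u flag, the pattern is matched by code point, so the
// supplementary-plane block U+1F300-U+1F5FF can be given directly.
const EMOJI_BLOCKS = /[\uE000-\uF8FF\u{1F300}-\u{1F5FF}]/gu;

// Hypothetical helper covering the same ranges as the surrogate version.
function stripEmojiBlocks(text) {
  return text.replace(EMOJI_BLOCKS, '');
}

console.log(stripEmojiBlocks('a\u{1F534}b') === 'ab'); // true
```

The surrogate form remains the safer choice if the code must run in older engines that reject the u flag.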

However, note that there are other characters in the Basic Multilingual Plane that are used as emoji by phones but which long predate emoji. For example U+2665 is the traditional Heart Suit character ♥, but it may be rendered as an emoji graphic on some devices. It's up to you whether you treat this as emoji and try to remove it. See this list for more examples.
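One way to keep that decision explicit is to make the older symbols opt-in. The sketch below assumes you treat the whole Miscellaneous Symbols block (U+2600 to U+26FF, which contains U+2665) as removable; that choice, the helper name removeEmoji, and its alsoRemoveSymbols parameter are illustrative assumptions, not part of the answer.

```javascript
// Ranges from the expression above: Private Use Area plus U+1F300-U+1F5FF.
const ASTRAL_EMOJI = /[\uE000-\uF8FF]|\uD83C[\uDF00-\uDFFF]|\uD83D[\uDC00-\uDDFF]/g;

// Assumption: Miscellaneous Symbols block, which includes U+2665.
const MISC_SYMBOLS = /[\u2600-\u26FF]/g;

// Hypothetical helper: older BMP symbols are removed only on request.
function removeEmoji(text, alsoRemoveSymbols) {
  const stripped = text.replace(ASTRAL_EMOJI, '');
  return alsoRemoveSymbols ? stripped.replace(MISC_SYMBOLS, '') : stripped;
}

const sample = 'I \u2665 JS \uD83D\uDD34';
console.log(removeEmoji(sample, false) === 'I \u2665 JS '); // true: heart kept
console.log(removeEmoji(sample, true) === 'I  JS ');        // true: heart removed
```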

bobince answered Oct 11 '22 21:10