Find whether there is an emoji in a string in Python 3 [duplicate]

I want to check that a string contains exactly one emoji, using Python 3. For example, an is_emoji function should return True only if the string has exactly one emoji:

def is_emoji(s):
    pass

is_emoji("😘") #True
is_emoji("😘◼️") #False

I tried to use regular expressions, but emojis don't have a fixed length. For example:

print(len("◼️".encode("utf-8"))) # 6 
print(len("😘".encode("utf-8"))) # 4
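
The byte counts differ because the two strings hold different numbers of code points. A small sketch of how to see this, assuming the square in the example is U+25FC followed by VARIATION SELECTOR-16 (which is what 6 UTF-8 bytes suggests); the helper name describe is made up for illustration:

```python
import unicodedata

def describe(s):
    """Return (code point, Unicode name, UTF-8 byte count) for each code point in s."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), len(ch.encode("utf-8")))
        for ch in s
    ]

kiss = "\U0001F618"      # the kissing face: a single code point
square = "\u25FC\uFE0F"  # assumed: base square plus VARIATION SELECTOR-16

for label, text in (("kiss", kiss), ("square", square)):
    # len() counts code points, not bytes and not visible glyphs
    print(label, len(text), len(text.encode("utf-8")), describe(text))
```

So a check based on byte length or len() alone cannot tell one emoji from two; the string has to be split into emoji sequences first.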

asked Mar 25 '16 08:03 by Siyanew



2 Answers

You could try using the emoji package. It's primarily used to convert escape sequences into Unicode emoji, but as a result it contains an up-to-date list of emojis.

from emoji import UNICODE_EMOJI

def is_emoji(s):
    return s in UNICODE_EMOJI

There are complications, though, as sometimes two or more Unicode code points map to one printable glyph. For instance, a human emoji followed by an "emoji modifier Fitzpatrick type" code point takes on that skin tone, and certain emoji separated by a "zero width joiner" should be treated like a single character.
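
One way to handle those cases is to group code points into clusters before counting. The following is only a sketch using the standard library: BASE_EMOJI is a tiny hand-picked set standing in for a real emoji list, and the function names are hypothetical. It treats a variation selector or skin-tone modifier as part of the preceding emoji, and a zero width joiner as gluing two emoji together.

```python
ZWJ = "\u200D"   # ZERO WIDTH JOINER
VS16 = "\uFE0F"  # VARIATION SELECTOR-16 (emoji presentation)
SKIN_TONES = {chr(cp) for cp in range(0x1F3FB, 0x1F400)}  # Fitzpatrick modifiers

# Assumption: a tiny illustrative set; a real check needs a full emoji list.
BASE_EMOJI = {
    "\U0001F618",  # face throwing a kiss
    "\u25FC",      # black medium square
    "\U0001F44D",  # thumbs up
    "\U0001F468",  # man
    "\U0001F469",  # woman
    "\U0001F467",  # girl
}

def split_emoji_clusters(s):
    """Return the emoji clusters in s, or None if s contains anything else."""
    clusters = []
    i, n = 0, len(s)
    while i < n:
        if s[i] not in BASE_EMOJI:
            return None
        start = i
        i += 1
        while True:
            # Absorb modifiers that attach to the preceding emoji.
            while i < n and (s[i] == VS16 or s[i] in SKIN_TONES):
                i += 1
            # A joiner followed by another emoji extends the same cluster.
            if i + 1 < n and s[i] == ZWJ and s[i + 1] in BASE_EMOJI:
                i += 2
            else:
                break
        clusters.append(s[start:i])
    return clusters

def is_single_emoji(s):
    clusters = split_emoji_clusters(s)
    return clusters is not None and len(clusters) == 1
```

With this sketch, a thumbs up plus a skin-tone modifier, or a man-woman-girl family joined by zero width joiners, counts as one emoji, while a kissing face followed by a square counts as two. It does not cover flags, keycaps, or tag sequences.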

answered Oct 01 '22 10:10 by Dunes


This works in Python 3:

def is_emoji(s):
    emojis = ["😘", "◼"]  # add more emojis here; a list keeps each emoji whole instead of iterating code points
    count = 0
    for emoji in emojis:
        count += s.count(emoji)
        if count > 1:
            return False
    return bool(count)

Test:

>>> is_emoji("😘")
True
>>> is_emoji('◼')
True
>>> is_emoji("😘◼️")
False

Combine with Dunes' answer to avoid typing all emojis:

from emoji import UNICODE_EMOJI

def is_emoji(s):
    count = 0
    for emoji in UNICODE_EMOJI:
        count += s.count(emoji)
        if count > 1:
            return False
    return bool(count)

This is not terribly fast, because UNICODE_EMOJI contains nearly 1330 items, but it works.
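
If speed matters, one alternative to calling s.count once per entry is to compile the list into a single regular expression and scan the string once. This is a sketch under assumptions: EMOJI_LIST is a short made-up stand-in for the package's list, and the function names are hypothetical. Sorting the alternatives longest first also avoids counting a multi-code-point emoji once for the sequence and again for its parts, which can happen with the loop above if the list holds both.

```python
import re

# Assumption: a short stand-in list; the real list would be much larger.
EMOJI_LIST = [
    "\U0001F618",            # face throwing a kiss
    "\u25FC",                # black medium square
    "\u25FC\uFE0F",          # the same square with VARIATION SELECTOR-16
    "\U0001F44D",            # thumbs up
    "\U0001F44D\U0001F3FD",  # thumbs up with a skin-tone modifier
]

# Longest alternatives first, so a full sequence wins over its own prefix.
EMOJI_RE = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOJI_LIST, key=len, reverse=True))
)

def count_emoji(s):
    """Count non-overlapping emoji matches in a single pass over s."""
    return len(EMOJI_RE.findall(s))

def has_exactly_one_emoji(s):
    return count_emoji(s) == 1
```

Like the loop above, this counts emoji anywhere in the string, so text around a single emoji does not change the result.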

answered Oct 01 '22 09:10 by Mike Müller