I need to split emoji from each other. For example:
EM = 'Hey 😷😷😷'
EM.split()
Splitting this gives
['Hey', '😷😷😷']
but I want
['Hey', '😷', '😷', '😷']
and I want this to work for all emoji.
You should be able to use get_emoji_regexp from the emoji package
(https://pypi.org/project/emoji/), together with the usual split
function. Something like:
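import functools
import operator
import emoji
em = 'Hey 😷😷😷'
# Split on emoji; the pattern's capturing group keeps each emoji in the result
em_split_emoji = emoji.get_emoji_regexp().split(em)
# Split every piece on whitespace (this also drops the empty strings)
em_split_whitespace = [substr.split() for substr in em_split_emoji]
# Flatten the list of lists into a single list
em_split = functools.reduce(operator.concat, em_split_whitespace)
print(em_split)
outputs:
['Hey', '😷', '😷', '😷']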
A more complex case, with family, skin tone modifiers, and a flag:
em = 'Hey 👨‍👩‍👧‍👧👨🏿😷😷🇬🇧'
em_split_emoji = emoji.get_emoji_regexp().split(em)
em_split_whitespace = [substr.split() for substr in em_split_emoji]
em_split = functools.reduce(operator.concat, em_split_whitespace)
for separated in em_split:
    print(separated)
outputs:
Hey
👨‍👩‍👧‍👧
👨🏿
😷
😷
🇬🇧
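(I think something's up with using print on a list that contains the family emoji: a list is printed using the repr of each item, and the repr escapes the U+200D zero-width joiner, hence printing each item of the list separately. See the related question "Printing family emoji, with U+200D zero-width joiner, directly, vs via list".)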
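As far as I know, newer releases of the emoji package (2.0 and later) no longer provide get_emoji_regexp, so the code above may fail with an AttributeError there. Below is a minimal sketch of the same split-then-flatten idea using only the standard library. The pattern is a simplified assumption of mine, not the emoji package's table: it only covers a few common code point ranges, skin tone modifiers, zero-width-joiner sequences and flag pairs. The names EMOJI_RE and split_emoji are hypothetical.

```python
import re

# Simplified building blocks. These ranges are an assumption for illustration
# and do not cover every emoji defined by Unicode.
REGIONAL = r'[\U0001F1E6-\U0001F1FF]'            # flag letters, used in pairs
BASE = r'[\u2600-\u27BF\U0001F300-\U0001FAFF]'   # common emoji blocks
SKIN_TONE = r'[\U0001F3FB-\U0001F3FF]'           # skin tone modifiers
UNIT = rf'{BASE}(?:{SKIN_TONE}|\uFE0F)?'         # emoji + optional modifier

# One capturing group, so re.split keeps the matched emoji in its result.
# Units joined by U+200D (zero-width joiner) are treated as a single emoji.
EMOJI_RE = re.compile(rf'({REGIONAL}{{2}}|{UNIT}(?:\u200D{UNIT})*)')


def split_emoji(text):
    """Split text on whitespace and return each emoji as its own item."""
    # str.split() leaves an emoji piece intact and turns '' into an empty
    # list, so the empty strings produced by re.split vanish here.
    return [token for piece in EMOJI_RE.split(text) for token in piece.split()]


# Family (joined with U+200D), man with skin tone, two masks, and a flag,
# written as escapes so the example does not depend on the file encoding.
family = '\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F467'
em = 'Hey ' + family + '\U0001F468\U0001F3FF' + '\U0001F637' * 2 + '\U0001F1EC\U0001F1E7'

for separated in split_emoji(em):
    print(separated)
```

For anything beyond a quick script, a maintained emoji table (such as the one in the emoji package) is likely the safer choice, since the ranges above will miss some emoji.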