I have a JSON file containing a dictionary of about 1500 emoji characters, and I want to import them into my Python code. I read the file and converted it to a Python dictionary, but now I have only 143 records. How can I import all the emoji into my code? This is my code:
import sys
import ast
file = open('emojidescription.json','r').read()
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
emoji_dictionary = ast.literal_eval(file.translate(non_bmp_map))
#word = word.replaceAll(",", " ");
keys = list(emoji_dictionary["emojis"][0].keys())
values = list(emoji_dictionary["emojis"][0].values())
file_write = open('output.txt','a')
print(len(keys))
for i in range(len(keys)):
    try:
        content = 'word = word.replace("{0}", "{1}")'.format(keys[i],values[i][0])
    except Exception as e:
        content = 'word = word.replace("{0}", "{1}")'.format(keys[i],'')
    #file.write()
    #print(keys[i],values[i])
    print(content)
file_write.close()
This is my input sample:
{
"emojis": [
{
"👨🎓": ["Graduate"],
"©": ["Copy right"],
"®": ["Registered"],
"👨👩👧": ["family"],
"👩❤️💋👩": ["love"],
"™": ["trademark"],
"👨❤👨": ["love"],
"⌚": ["time"],
"⌛": ["wait"],
"⭐": ["star"],
"🐘": ["Elephant"],
"🐕": ["Cat"],
"🐜": ["ant"],
"🐔": ["cock"],
"🐓": ["cock"],
This is my result; the 143 is the number of emoji that were loaded:
143
word = word.replace("����", "family")
word = word.replace("Ⓜ", "")
word = word.replace("♥", "")
word = word.replace("♠", "")
word = word.replace("⌛", "wait")
Emojis can also be produced with the emoji module, a third-party package for Python. To install it, run pip install emoji in the terminal. Its emojize() function takes the emoji's CLDR short name written between colons (for example :thumbs_up:) and returns the corresponding emoji.
Every emoji has a Unicode code point (or a sequence of code points) assigned to it. To write one in a Python string literal, use a \U escape followed by exactly eight hexadecimal digits: drop the "U+" and left-pad the code point with zeros. For example, U+1F605 is written as \U0001F605.
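As a small illustration of the escape rule above (Python 3 assumed; the variable names are only for the example):

```python
smile = "\U0001F605"   # U+1F605: \U takes exactly eight hex digits
watch = "\u231A"       # U+231A: code points up to U+FFFF fit the four-digit \u form

# chr() builds the same character from the numeric code point.
assert smile == chr(0x1F605)

print(len(smile))              # 1
print(f"U+{ord(smile):04X}")   # U+1F605
```

In Python 3 a string is a sequence of code points, so an emoji above U+FFFF still counts as a single character.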
To remove the emojis, we set the parameter no_emoji to True.
Emojis look like images or icons, but they are not. They are characters from the Unicode character set, which covers almost all of the characters and symbols in the world. UTF-8 is one way of encoding those characters as bytes.
I'm not sure why you're seeing only 143 records from an input of 1500 (your sample doesn't seem to display this behavior).
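One possible explanation, offered as an assumption because the full file is not shown: the non_bmp_map in the question replaces every character above U+FFFF with U+FFFD before the text is parsed. Distinct emoji can then turn into identical keys, and a dict keeps only one entry per key. A minimal sketch with a made-up three-entry sample:

```python
import ast
import json
import sys

# Made-up sample: two non-BMP emoji keys and one BMP key.
text = '{"\U0001F418": ["Elephant"], "\U0001F41C": ["ant"], "\u231B": ["wait"]}'

# Same mapping as in the question: every non-BMP character becomes U+FFFD.
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xFFFD)

collapsed = ast.literal_eval(text.translate(non_bmp_map))
intact = json.loads(text)

print(len(collapsed))  # 2: both emoji keys became the same "\ufffd" key
print(len(intact))     # 3
```

If this is what happens with the real file, dropping the translate step and parsing the text with json should keep every key.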
The setup doesn't seem to do anything useful, but what you're doing boils down to (simplified and skipping lots of details):
d = ...  # read the JSON as a Python dict
keys = list(d.keys())
values = list(d.values())
for i in range(len(keys)):
    key = keys[i]
    value = values[i]
and that should be completely correct. There are better ways to do this in Python, however, like using the zip function:
d = ...  # read the JSON as a Python dict
keys = d.keys()
values = d.values()
for key, value in zip(keys, values):  # zip pairs up elements position by position
    ...
or simply asking the dict for its items:
for key, value in d.items():
    ...
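To check that the three loop styles visit the same pairs, here is a short sketch on a made-up dict (the entries are borrowed from the question's sample):

```python
d = {"⌚": ["time"], "⌛": ["wait"], "⭐": ["star"]}

# Index-based pairing, as in the question.
keys = list(d.keys())
values = list(d.values())
by_index = [(keys[i], values[i]) for i in range(len(keys))]

# Pairing with zip, and asking the dict directly.
by_zip = list(zip(d.keys(), d.values()))
by_items = list(d.items())

print(by_index == by_zip == by_items)  # True
```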
The json module makes reading and writing JSON much simpler (and safer), and using the idiom from above, the problem reduces to this:
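import json

# Read the file as bytes; json.load decodes the UTF-8 itself.
with open('emoji.json', 'rb') as fp:
    emojis = json.load(fp)

with open('output.py', 'wb') as fp:
    for k, v in emojis['emojis'][0].items():
        # Use the first description, or an empty string if the list is empty.
        val = u'word = word.replace("{0}", "{1}")\n'.format(k, v[0] if v else "")
        fp.write(val.encode('utf-8'))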
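For a self-contained check of the same idea, the sketch below writes a made-up sample to a temporary file and reads it back. It assumes Python 3 and uses text mode with an explicit encoding="utf-8" instead of binary mode; the sample entries and the temporary path are only for illustration.

```python
import json
import os
import tempfile

# Made-up sample in the same shape as the question's file.
sample = {"emojis": [{"\U0001F418": ["Elephant"], "\U0001F41C": ["ant"], "\u24C2": []}]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "emoji.json")
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(sample, fp, ensure_ascii=False)

    # An explicit encoding avoids depending on the platform default.
    with open(path, encoding="utf-8") as fp:
        emojis = json.load(fp)["emojis"][0]

lines = ['word = word.replace("{0}", "{1}")'.format(k, v[0] if v else "")
         for k, v in emojis.items()]
print(len(lines))  # 3
```

Every key survives the round trip, including the two emoji above U+FFFF and the entry with an empty description list.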