Many emoji characters are not read by python file read

I have a JSON file containing a dictionary of about 1500 emoji characters, and I want to import them into my Python code. I read the file and converted it to a Python dictionary, but now I have only 143 records. How can I import all the emoji into my code? This is my code:

import sys
import ast

file = open('emojidescription.json','r').read()
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
emoji_dictionary = ast.literal_eval(file.translate(non_bmp_map))

#word = word.replaceAll(",", " ");

keys = list(emoji_dictionary["emojis"][0].keys())
values = list(emoji_dictionary["emojis"][0].values())

file_write = open('output.txt','a')

print(len(keys))
for i in range(len(keys)):
    try:
        content = 'word = word.replace("{0}", "{1}")'.format(keys[i],values[i][0])
    except Exception as e:
        content = 'word = word.replace("{0}", "{1}")'.format(keys[i],'')
    #file.write()
    #print(keys[i],values[i])
    print(content)


file_write.close()

This is my input sample

{

    "emojis": [
        {

            "👨‍🎓": ["Graduate"],
            "©": ["Copy right"],
            "®": ["Registered"],
            "👨‍👩‍👧": ["family"],
            "👩‍❤️‍💋‍👩": ["love"],
            "™": ["trademark"],
            "👨‍❤‍👨": ["love"], 
            "⌚": ["time"],
            "⌛": ["wait"], 
            "⭐": ["star"],
            "🐘": ["Elephant"],
            "🐕": ["Cat"],
            "🐜": ["ant"],
            "🐔": ["cock"],
            "🐓": ["cock"],

This is my result; the 143 is the number of emoji that were read.

143

word = word.replace("�‍�‍�‍�", "family")

word = word.replace("Ⓜ", "")

word = word.replace("♥", "")

word = word.replace("♠", "")

word = word.replace("⌛", "wait")

asked Jun 10 '17 08:06

CDR

1 Answer

I'm not sure why you're seeing only 143 records from an input of 1500 (your sample doesn't seem to display this behavior).
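
One possible contributor, offered as an assumption since the full input file isn't shown: the `non_bmp_map` translation in the question replaces every code point above U+FFFF with U+FFFD, so distinct emoji keys can become identical and overwrite each other when the dict literal is evaluated. A minimal sketch, using a made-up three-entry sample rather than the real file:

```python
import ast
import sys

# Same translation table as in the question: every code point above
# U+FFFF is mapped to U+FFFD (the replacement character).
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xFFFD)

# Hypothetical sample; all three keys lie outside the Basic Multilingual Plane.
text = '{"\U0001F418": ["Elephant"], "\U0001F415": ["Dog"], "\U0001F41C": ["ant"]}'

# After translation the three keys are the same string, so only one survives.
collapsed = ast.literal_eval(text.translate(non_bmp_map))

# Without translation the keys stay distinct.
intact = ast.literal_eval(text)

print(len(collapsed), len(intact))
```

If this is what is happening, dropping the `translate` call (or reading the file with the json module) would keep the keys distinct.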

The setup doesn't seem to do anything useful, but what you're doing boils down to (simplified and skipping lots of details):

d = {}  # placeholder: the dict read from the JSON file
keys = list(d.keys())      # list() makes the views indexable in Python 3
values = list(d.values())
for i in range(len(keys)):
    key = keys[i]
    value = values[i]

and that should be completely correct. There are better ways to do this in Python, however, like using the zip function:

d = {}  # placeholder: the dict read from the JSON file
keys = d.keys()
values = d.values()
for key, value in zip(keys, values):  # zip picks pair-wise elements
    ...

or simply asking the dict for its items:

for key, value in d.items():
    ...

The json module makes reading and writing json much simpler (and safer), and using the idiom from above the problem reduces to this:

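import json

# Open in binary mode so no platform-dependent text decoding is applied.
with open('emoji.json', 'rb') as fp:
    emojis = json.load(fp)

with open('output.py', 'wb') as fp:
    for k, v in emojis['emojis'][0].items():
        # Use the first description, or an empty string if the list is empty.
        val = u'word = word.replace("{0}", "{1}")\n'.format(k, v[0] if v else "")
        fp.write(val.encode('utf-8'))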
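
To check that this approach keeps every key, here is a small self-contained round trip. It is only a sketch: the sample data, the temporary file name `emoji_sample.json`, and the variable names are made up for illustration, and it assumes Python 3.6 or later (where `json.load` accepts a file opened in binary mode).

```python
import json
import os
import tempfile

# Hypothetical sample in the same shape as the question's input:
# two non-BMP emoji, one BMP symbol, and one entry with no description.
sample = {
    "emojis": [
        {
            "\U0001F418": ["Elephant"],
            "\U0001F41C": ["ant"],
            "\u231B": ["wait"],
            "\u24C2": [],
        }
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "emoji_sample.json")
    # Write the sample as UTF-8 JSON, keeping the emoji unescaped.
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(sample, fp, ensure_ascii=False)
    # Read it back the same way as in the answer's code.
    with open(path, "rb") as fp:
        loaded = json.load(fp)

# Build the replace statements; empty description lists fall back to "".
lines = [
    'word = word.replace("{0}", "{1}")'.format(k, v[0] if v else "")
    for k, v in loaded["emojis"][0].items()
]
print(len(lines))
```

All four entries come back, including the ones outside the Basic Multilingual Plane, because no characters are replaced before parsing.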

answered Nov 01 '22 21:11

thebjorn

Tags:

python

unicode

python-unicode

emoji
