I have been trying to write a simple script that can save user input (originating from an iPhone) to a text file. The issue I'm having is that when a user uses an Emoji icon, it breaks the whole thing.
OS: Ubuntu
Python Version: 2.7.3
My code currently looks like this:
import codecs

f = codecs.open(path, "w+", encoding="utf8")
f.write("Desc: " + json_obj["description"])
f.close()
When an Emoji character is passed in the description variable, I get the error:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
Any possible help is appreciated.
The most likely problem here is that json_obj["description"] is actually a UTF-8-encoded str, not a unicode. So, when you try to write it to a codecs-wrapped file, Python has to decode it from str to unicode so it can re-encode it. And that's the part that fails, because that automatic decoding uses sys.getdefaultencoding(), which is 'ascii'.
For example:
>>> f = codecs.open('emoji.txt', 'w+', encoding='utf-8')
>>> e = u'\U0001f1ef'
>>> print e
🇯
>>> e
u'\U0001f1ef'
>>> f.write(e)
>>> e8 = e.encode('utf-8')
>>> e8
'\xf0\x9f\x87\xaf'
>>> f.write(e8)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 0: ordinal not in range(128)
There are two possible solutions here.
First, you can explicitly decode everything to unicode as early as possible. I'm not sure where your json_obj is coming from, but I suspect it's not actually the stdlib json.loads, because by default, that always gives you unicode keys and values. So, replacing whatever you're using for JSON with the stdlib functions will probably solve the problem.
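For example, something along these lines should work (a minimal sketch; the raw JSON string and the file name below are made-up stand-ins for whatever the iPhone actually sends and wherever you actually write):

import json
import codecs

raw = '{"description": "\xf0\x9f\x98\x85"}'   # hypothetical payload: UTF-8 bytes containing an emoji
json_obj = json.loads(raw)                    # stdlib json.loads returns unicode values in Python 2
f = codecs.open("desc.txt", "w+", encoding="utf-8")
f.write(u"Desc: " + json_obj["description"])  # unicode all the way; encoded exactly once, on write
f.close()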
Second, you can leave everything as UTF-8 str objects and stay in binary mode. If you know you have UTF-8 everywhere, just open the file instead of codecs.open, and write without any encoding.
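A minimal sketch of that approach, assuming the description really is a UTF-8 str (the literal below is just the UTF-8 bytes of one emoji, used as a stand-in):

desc = '\xf0\x9f\x98\x85'         # UTF-8-encoded str, e.g. from a JSON parser that doesn't decode
f = open("desc.txt", "wb")        # plain built-in open in binary mode, no codec layer
f.write("Desc: " + desc)          # str + str: no implicit decode, so no UnicodeDecodeError
f.close()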
Also, you should strongly consider using io.open instead of codecs.open. It has a number of advantages over codecs; the only disadvantage is that it's not backwards compatible with Python 2.5. Unless that matters to you, don't use codecs.
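For example (a sketch; io.open behaves like Python 3's built-in open, so a file opened in text mode expects unicode text):

import io

f = io.open("desc.txt", "w", encoding="utf-8")
f.write(u"Desc: \U0001F605")      # must be unicode; io.open encodes it on write
f.close()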