How to reproduce UnicodeEncodeError?

Tags:

python

python-unicode

python-2.7

I get an error in a production system, which I fail to reproduce in a development environment:

with io.open(file_name, 'wt') as fd:
    fd.write(data)

Exception:

  File "/home/.../foo.py", line 18, in foo
    fd.write(data)

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 6400: ordinal not in range(128)

I already tried to put a lot of strange characters into the variable data.

But so far I have not been able to reproduce a UnicodeEncodeError.

What needs to be in data to get a UnicodeEncodeError?

Update

python -c 'import locale; print locale.getpreferredencoding()'
UTF-8

Update2

If I call locale.getpreferredencoding() via shell and via web request, the encoding is "UTF-8".

I updated the exception handling in my code and have been logging getpreferredencoding() for a few days. Now the error happened again (so far I am not able to force or reproduce it), and the encoding is "ANSI_X3.4-1968"!

I have no clue where this encoding gets set ....

This takes my problem in a different direction and makes this question moot. My problem is now: where does the preferred encoding get altered? But that is not part of this question.

A big thank you, for all who

asked Jan 10 '17 11:01

guettli

1 Answer

You are relying on the default encoding for the platform; when that default encoding can't support the Unicode characters you are writing to the file, you get an encoding exception.

From the io.open() documentation:

encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is platform dependent (whatever locale.getpreferredencoding() returns), but any encoding supported by Python can be used.

For your specific situation, the default returned by locale.getpreferredencoding() is ASCII, so any Unicode character outside the ASCII range (U+0080 and up) causes this error.
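To reproduce the error without depending on the machine's locale, one option is to pass encoding='ascii' explicitly, which mimics what happens when the platform default is ASCII. The following is only a minimal sketch: the helper name try_write and the sample string are made up for illustration and are not part of the original code.

```python
import io
import os
import tempfile


def try_write(data, encoding):
    """Write data to a temporary file; return the UnicodeEncodeError, or None on success."""
    fd_num, path = tempfile.mkstemp()
    os.close(fd_num)
    try:
        with io.open(path, 'wt', encoding=encoding) as fd:
            fd.write(data)
        return None
    except UnicodeEncodeError as exc:
        return exc
    finally:
        os.remove(path)


# u'\xe4' is the same character that appears in the traceback above.
data = u'h\xe4llo'

ascii_error = try_write(data, 'ascii')   # stands in for an ASCII platform default
utf8_error = try_write(data, 'utf-8')

print(repr(ascii_error))
print(repr(utf8_error))
```

With an ASCII encoding the write fails at the first character above U+007F; with UTF-8 the same data is written without error.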

Note that the locale is taken from your environment; if it is ASCII, that typically means the locale is set to the POSIX default locale, C.
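The "ANSI_X3.4-1968" value logged in Update2 is just another name for ASCII, which can be checked with the codecs module. The sketch below does that and also reports what the current process would use as its default; the helper name resolves_to_ascii is hypothetical.

```python
import codecs
import locale


def resolves_to_ascii(encoding_name):
    """Return True if Python maps the given encoding name to its ASCII codec."""
    try:
        return codecs.lookup(encoding_name).name == 'ascii'
    except LookupError:
        # Unknown encoding name: treat it as "not ASCII".
        return False


# The name reported in Update2 of the question.
print(resolves_to_ascii('ANSI_X3.4-1968'))

# The default that applies in this process when no encoding is passed.
preferred = locale.getpreferredencoding()
print('{0}: ascii={1}'.format(preferred, resolves_to_ascii(preferred)))
```

Logging this value at process start-up, alongside the LANG and LC_ALL environment variables, may help find out which launch path ends up in the C locale. One possible cause, offered only as a guess, is a process started from a cron job or init script whose environment does not set those variables.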

Specify the encoding explicitly:

with io.open(file_name, 'wt', encoding='utf8') as fd:
    fd.write(data)

I used UTF-8 as an example; what you pick depends entirely on your use cases and the data you are trying to write out.
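As a rough illustration of why the choice matters, the same text produces different bytes under different encodings, and whoever reads the file has to use the encoding it was written with. The sample string and the use of Latin-1 as the second encoding are arbitrary picks for this sketch.

```python
data = u'h\xe4llo'

# The same character becomes two bytes in UTF-8 but a single byte in Latin-1.
utf8_bytes = data.encode('utf-8')
latin1_bytes = data.encode('latin-1')

# Reading UTF-8 bytes with the wrong encoding does not fail; it silently garbles the text.
misread = utf8_bytes.decode('latin-1')

print(repr(utf8_bytes))
print(repr(latin1_bytes))
print(repr(misread))
```

UTF-8 can represent every Unicode character, so it is a reasonable default when nothing else dictates the file format; a single-byte encoding such as Latin-1 only makes sense when the consumer of the file expects it.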

answered Oct 17 '22 15:10

Martijn Pieters
