
Strange `UnicodeEncodeError` using `os.path.exists`

In a web-application (using Flask), I get the following error:

Unable to retrieve the thumbnail for u'/var/data/uploads/2012/03/22/12 Gerd\xb4s Banjo Trio 1024.jpg'
Traceback (most recent call last):
 File "/var/www/beta/env/lib/python2.7/site-packages/dblib-1.0dev3-py2.7.egg/dblib/orm/file.py", line 169, in get_thumbnail
   if not exists(filename):
 File "/usr/lib/python2.7/genericpath.py", line 18, in exists
   os.stat(path)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb4' in position 52: ordinal not in range(128)

Note that I include the repr() of the file name in the logged error. This shows that the file name is passed as a Unicode instance. So much is correct...

If I run the offending call in the Python interpreter, it works as expected:

>>> from os.path import exists
>>> exists(u'/var/data/uploads/2012/03/22/12 Gerd\xb4s Banjo Trio 1024.jpg')
True

So apparently, while running in the Flask environment, Python tries to encode the file name with the ASCII codec instead of UTF-8. The application is deployed using mod_wsgi behind Apache httpd.

I assume I have to tell either one of them to use UTF-8 somewhere? But where?
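The failure mode in the traceback can be reproduced without touching the file system. The following is a minimal sketch, not the code inside `os.stat`: it only shows that the file name from the log cannot be encoded with the ASCII codec but can be with UTF-8. The helper name `try_encode` is hypothetical.

```python
# -*- coding: utf-8 -*-
# File name taken from the logged error above.
filename = u'/var/data/uploads/2012/03/22/12 Gerd\xb4s Banjo Trio 1024.jpg'


def try_encode(name, codec):
    """Hypothetical helper: return the encoded bytes, or the
    UnicodeEncodeError that the codec raised."""
    try:
        return name.encode(codec)
    except UnicodeEncodeError as exc:
        return exc


# ASCII cannot represent u'\xb4', so this yields a UnicodeEncodeError.
ascii_result = try_encode(filename, 'ascii')

# UTF-8 can represent it, so this yields a byte string.
utf8_result = try_encode(filename, 'utf-8')
```

When a Unicode path is passed to `os.path.exists`, Python has to perform an encoding step like this one using the file system encoding, which is why the codec chosen by the process environment matters.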

exhuma asked May 01 '12 09:05


1 Answer

The Django docs describe the same issue. Since it also involves mod_wsgi, the same solution should apply:

https://docs.djangoproject.com/en/dev/howto/deployment/wsgi/modwsgi/#if-you-get-a-unicodeencodeerror

Excerpt from the above linked doc:

[...] you must ensure that the environment used to start Apache is configured to accept non-ASCII file names. If your environment is not correctly configured, you will trigger UnicodeEncodeError exceptions when calling functions like the ones in os.path on filenames that contain non-ASCII characters.

To avoid these problems, the environment used to start Apache should contain settings analogous to the following:

export LANG='en_US.UTF-8'
export LC_ALL='en_US.UTF-8'

Consult the documentation for your operating system for the appropriate syntax and location to put these configuration items; /etc/apache2/envvars is a common location on Unix platforms. Once you have added these statements to your environment, restart Apache.
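To confirm that the settings actually reached the process running the application, one option is to log what Python sees from inside the WSGI application. This is a rough sketch under that assumption; the function name `describe_encoding_environment` is hypothetical and not part of Flask, Django, or mod_wsgi.

```python
import locale
import os
import sys


def describe_encoding_environment(environ=None):
    """Hypothetical diagnostic: collect the locale-related settings
    visible to the current process."""
    environ = os.environ if environ is None else environ
    return {
        # Variables set by the environment that started the server.
        "LANG": environ.get("LANG"),
        "LC_ALL": environ.get("LC_ALL"),
        # Encoding Python uses to convert Unicode file names to bytes.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Encoding derived from the locale, without changing the locale.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


report = describe_encoding_environment()
```

Logging `report` once at application start-up, before and after editing the Apache environment, shows whether the change took effect. If `filesystem_encoding` still names an ASCII codec, the variables were most likely not exported in the environment that starts Apache.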

Graham Dumpleton answered Oct 02 '22 23:10