
UnicodeDecodeError : 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)

On one of my machines I get this error whenever I work with Google App Engine or Django.

For example:

  • app.yaml

    application: demas1252c
    version: 1
    runtime: python
    api_version: 1

    handlers:
    - url: /images
      static_dir: images
    - url: /css
      static_dir: css
    - url: /js
      static_dir: js
    - url: /.*
      script: demas1252c.py

  • demas1252c.py

    import cgi
    import wsgiref.handlers

    from google.appengine.ext.webapp import template
    from google.appengine.ext import webapp

    class MainPage(webapp.RequestHandler):
        def get(self):
            values = {'id' : 10}
            self.response.out.write(template.render('foto.html', values))

    application = webapp.WSGIApplication([('/', MainPage)], debug = True)
    wsgiref.handlers.CGIHandler().run(application)

  • foto.html

    <!DOCTYPE html>
    <html lang="en">
        <head></head>
    <body>some</body>
    </html>

error message:

    C:\artefacts\dev\project>"c:\Program Files\Google\google_appengine\dev_appserver.py" foto-hosting
    Traceback (most recent call last):
      File "c:\Program Files\Google\google_appengine\dev_appserver.py", line 69, in <module>
        run_file(__file__, globals())
      File "c:\Program Files\Google\google_appengine\dev_appserver.py", line 65, in run_file
        execfile(script_path, globals_)
      File "c:\Program Files\Google\google_appengine\google\appengine\tools\dev_appserver_main.py", line 92, in <module>
        from google.appengine.tools import dev_appserver
      File "c:\Program Files\Google\google_appengine\google\appengine\tools\dev_appserver.py", line 140, in <module>
        mimetypes.add_type(mime_type, '.' + ext)
      File "C:\Python27\lib\mimetypes.py", line 344, in add_type
        init()
      File "C:\Python27\lib\mimetypes.py", line 355, in init
        db.read_windows_registry()
      File "C:\Python27\lib\mimetypes.py", line 260, in read_windows_registry
        for ctype in enum_types(mimedb):
      File "C:\Python27\lib\mimetypes.py", line 250, in enum_types
        ctype = ctype.encode(default_encoding) # omit in 3.x!
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)

When I work with static files in Django (without GAE) I get a very similar error (with a different stack trace).

To find the cause of the error, I added some debugging output to mimetypes.py:

    print '====='
    print ctype
    ctype = ctype.encode(default_encoding) # omit in 3.x!

Then I get the following messages in my console:

    =====
    video/x-ms-wvx
    =====
    video/x-msvideo
    =====
    рєфшю/AMR
    Traceback (most recent call last):

In the registry, under HKCR/Mime/Database/ContentType/, I have five keys with Russian (Cyrillic) letters in their names. How can I fix this error?

ceth asked Nov 21 '10 12:11



2 Answers

This is a bug in mimetypes, triggered by bad data in the registry. (рєфшю/AMR is not at all a valid MIME media type.)
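One possible reading of the garbled name, offered as an assumption rather than something the question states: the key is probably the Russian word аудио ("audio") stored in Windows-1251, and the console is rendering those bytes with code page 866. A small Python 3 sketch of that round trip (the choice of the two code pages is my guess):

```python
# Assumption: the registry key is "аудио/AMR" stored as Windows-1251 bytes,
# and the console that printed it was using code page 866.
original = "аудио/AMR"

raw = original.encode("cp1251")   # bytes as an ANSI (cp1251) API would return them
shown = raw.decode("cp866")       # the same bytes as a cp866 console would display them

print(repr(raw))
print(shown)
```

Under that assumption the first byte is 0xe0, the same byte the traceback complains about at position 0, and the displayed text matches the рєфшю/AMR line from the question.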

ctype is a registry key name returned by _winreg.EnumKey, which mimetypes is expecting to be a Unicode string, but it isn't. Unlike _winreg.QueryValueEx, EnumKey returns a byte string (direct from the ANSI version of the Windows API; _winreg in Python 2 doesn't use the Unicode interfaces even though it returns Unicode strings, so it'll never read non-ANSI characters correctly).

So the attempt to .encode it fails with a UnicodeDecodeError: Python 2 first implicitly decodes the byte string as ASCII to get a Unicode string, before encoding it back to ASCII!

    try:
        ctype = ctype.encode(default_encoding) # omit in 3.x!
    except UnicodeEncodeError:
        pass

These lines in mimetypes should simply be removed.
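To illustrate why the `except UnicodeEncodeError` guard does not help, here is a rough sketch that spells out the two steps Python 2 performs when `.encode` is called on a byte string. It is written for Python 3 so the implicit decode has to be made explicit; `py2_style_encode` and the sample key name are hypothetical, chosen only to start with the byte 0xe0 from the error message.

```python
def py2_style_encode(raw, encoding="ascii"):
    """Mimic Python 2's str.encode(): implicit ASCII decode, then encode."""
    text = raw.decode("ascii")    # the hidden step; fails on any byte >= 0x80
    return text.encode(encoding)  # the step the caller actually asked for

# A plain ASCII key name passes through unchanged.
ok = py2_style_encode(b"video/x-msvideo")

# Hypothetical non-ASCII key name beginning with byte 0xe0.
caught = None
message = ""
try:
    py2_style_encode(b"\xe0/AMR")
except UnicodeEncodeError:
    caught = "encode"             # what the guard in mimetypes expects
except UnicodeDecodeError as exc:
    caught = "decode"             # what actually happens
    message = str(exc)

print(caught, message)
```

The decode step raises before any encoding is attempted, and UnicodeDecodeError is not a subclass of UnicodeEncodeError (both derive from UnicodeError), so the narrower `except` clause lets it through.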

ETA: added to bug tracker.

bobince answered Oct 11 '22 14:10


By the way, the main culprit is QuickTime, which adds non-ASCII MIME types to the Windows registry. The easiest fix is to manually find and remove the registry subkeys of HKCR/Mime/Database/ContentType/ whose names start with аудио/ and видео/.
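Before deleting anything by hand, it may help to list which keys are affected. This is only a sketch, written for Python 3, where the module is called `winreg` rather than `_winreg`. The helper names are hypothetical, and the registry path `MIME\Database\Content Type` under HKEY_CLASSES_ROOT is my assumption about the exact spelling of the location mentioned above. It only reads the registry and deletes nothing.

```python
def non_ascii_names(names):
    """Return the names containing characters outside 7-bit ASCII."""
    return [name for name in names if any(ord(ch) > 127 for ch in name)]


def list_content_type_keys():
    """Enumerate the MIME content-type subkey names (Windows only)."""
    import winreg  # imported here so the filter above also works off Windows

    # Assumed path; check it against your own registry before relying on it.
    path = r"MIME\Database\Content Type"
    names = []
    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, path) as key:
        index = 0
        while True:
            try:
                names.append(winreg.EnumKey(key, index))
            except OSError:       # raised once there are no more subkeys
                break
            index += 1
    return names


# Example with made-up key names, so it runs on any platform.
sample = ["video/x-msvideo", "аудио/AMR", "видео/avi"]
suspects = non_ascii_names(sample)
print(suspects)
```

On the affected Windows machine, `non_ascii_names(list_content_type_keys())` should print the candidates for removal, which can then be deleted in regedit after exporting a backup of the key.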

newtover answered Oct 11 '22 14:10