
UnicodeDecodeError with Django's request.FILES

Tags: python, django

I have the following code in a view:

def view(request):
    body = u""  
    for filename, f in request.FILES.items():
        body = body + 'Filename: ' + filename + '\n' + f.read() + '\n'

In some cases I get:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 7470: ordinal not in range(128)

What am I doing wrong? (I am using Django 1.1.)

Thank you.

seyfettin sipsak, asked Nov 09 '09 04:11


3 Answers

Django has some utilities in django.utils.encoding that handle this (smart_unicode, force_unicode, smart_str). Generally you just need smart_unicode, which decodes byte strings as UTF-8 by default.

from django.utils.encoding import smart_unicode

def view(request):
    body = u""
    for filename, f in request.FILES.items():
        # smart_unicode turns the raw bytes into unicode before concatenation
        body = body + 'Filename: ' + filename + '\n' + smart_unicode(f.read()) + '\n'

Silfheed, answered Nov 20 '22 20:11


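You are appending f.read() directly to a unicode string without decoding it. f.read() returns a byte string, so Python implicitly decodes it as ASCII, which fails on any byte above 127. If the data you are reading from the file is UTF-8 encoded, decode it as UTF-8; otherwise use whatever encoding it is in.

Decode it first and then append it to body, e.g.: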

data = f.read().decode("utf-8")
body = body + 'Filename: ' + filename + '\n' + data + '\n'
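
To make the decode step concrete, here is a minimal sketch in Python 3 terms, where bytes and text are separate types and the implicit ASCII decode from the question no longer happens silently. The helper name build_body, the sample filename, and the sample bytes are made up for illustration, and the sketch assumes the upload really is UTF-8.

```python
def build_body(files):
    """Build a text body from (filename, raw_bytes) pairs.

    `files` is a hypothetical stand-in for request.FILES.items(): each
    value plays the role of f.read(), i.e. undecoded bytes.
    """
    body = ""
    for filename, raw in files:
        # Decode explicitly; this sketch assumes the data is UTF-8.
        data = raw.decode("utf-8")
        body += "Filename: " + filename + "\n" + data + "\n"
    return body


# 0xf0 0x9f 0x98 0x80 is the UTF-8 encoding of U+1F600. The leading 0xf0
# is the same kind of byte the ASCII codec rejected in the traceback.
sample = [("notes.txt", b"hello \xf0\x9f\x98\x80")]
result = build_body(sample)

# The same bytes cannot be decoded as ASCII.
try:
    sample[0][1].decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

Decoding at the point where the bytes enter the program keeps everything downstream as text, so later concatenation cannot trigger a hidden decode.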

Anurag Uniyal, answered Nov 20 '22 21:11


Anurag's answer is correct. However, another problem here is that you cannot know for certain the encoding of the files that users upload. It may be useful to loop over a tuple of the most common encodings until one decodes without error:

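encodings = ('windows-xxx', 'iso-yyy', 'utf-8',)
raw = f.read()  # read once; a second read() would return an empty string
for e in encodings:
    try:
        data = raw.decode(e)
        break
    except UnicodeDecodeError:
        pass
else:
    # no candidate worked: fall back to UTF-8 with replacement characters
    data = raw.decode('utf-8', 'replace')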
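
The same idea can be packaged as a small helper. This is only a sketch in Python 3 terms: the function name decode_with_fallback and the default candidate list (utf-8, cp1252, latin-1) are assumptions chosen for illustration, not the encodings the answer had in mind. Strict codecs go first because latin-1 accepts every byte and would otherwise mask the others.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes `raw`.

    If no candidate succeeds, decode as UTF-8 with replacement characters
    and report the encoding as None.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Last resort: never raise, but mark undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace"), None


utf8_result = decode_with_fallback("caf\u00e9".encode("utf-8"))
cp1252_result = decode_with_fallback(b"caf\xe9")
latin1_result = decode_with_fallback(b"\x81")
fallback_result = decode_with_fallback(b"\xff", encodings=("ascii",))
```

A successful decode only means the bytes are valid in that encoding, not that the guess is right, so the returned encoding name is worth logging when the result looks wrong.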

shanyu, answered Nov 20 '22 21:11