
Django upload and handle CSV file with right encoding

Tags: python, django

I am trying to upload and process a CSV file in my Django project, but I get an encoding error. The CSV file was created on a Mac with Excel.

reader = csv.reader(request.FILES['file'].read().splitlines(), delimiter=";")
if withheader:
    reader.next()

data = [[field.decode('utf-8') for field in row] for row in reader]

With this code I get an error: http://puu.sh/1VmXc

If I decode as latin-1 instead, I get a different "error":

data = [[field.decode('latin-1') for field in row] for row in reader]

The result is: v¾gmontere, but it should be: vægmontere

Does anyone know what to do? I have tried a lot!

pkdkk asked Oct 06 '22 04:10


1 Answer

  1. The Python 2 csv module works on byte strings and comes with lots of unicode hassle. Try unicodecsv instead, or use Python 3.
  2. Excel on Mac exports CSV in a legacy encoding (typically Mac OS Roman) rather than UTF-8. In Mac OS Roman the byte 0xBE is æ, while in latin-1 the same byte is ¾, which is why decoding as latin-1 gives v¾gmontere. Prefer a tool such as LibreOffice, whose CSV export lets you choose the encoding.
  3. When handling user files: either make sure files are consistently encoded in UTF-8 and decode them only as UTF-8 (recommended), or use an encoding detection library like chardet.
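The points above can be sketched roughly as follows in Python 3. This is a minimal sketch, not a definitive implementation: the helper name `read_uploaded_csv` and its parameters are hypothetical, and the `mac_roman` fallback is an assumption that the file came from Excel on a Mac. It tries UTF-8 first and only falls back when the bytes are not valid UTF-8.

```python
import csv
import io


def read_uploaded_csv(raw_bytes, delimiter=";", has_header=False,
                      fallback_encoding="mac_roman"):
    """Decode uploaded CSV bytes and return the rows as lists of str.

    Hypothetical helper: tries UTF-8 first ("utf-8-sig" also strips a BOM),
    then falls back to fallback_encoding, which is only an assumption
    about where the file came from.
    """
    try:
        text = raw_bytes.decode("utf-8-sig")
    except UnicodeDecodeError:
        text = raw_bytes.decode(fallback_encoding)

    # newline="" keeps line endings untranslated so the csv module
    # can handle \r, \n and \r\n itself.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    if has_header:
        next(reader, None)  # skip the header row if there is one
    return list(reader)
```

In a Django view this might be called as `read_uploaded_csv(request.FILES['file'].read(), has_header=withheader)`. Note that the fallback is a guess: a legacy-encoded file can occasionally happen to be valid UTF-8, so asking users for UTF-8 files remains the more reliable option.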
stefanw answered Oct 12 '22 22:10