I want to create a CSV file through Django that contains unicode data (greek characters) and I want it to be opened directly from MS Excel. Elsewhere I had read about the unicodecsv library and I decided to use that. So, here's my view;
def get_csv(request, id): response = HttpResponse(mimetype='text/csv') response['Content-Disposition'] = 'attachment; filename=csv.csv' writer = unicodecsv.writer(response, encoding='utf-16"') writer.writerow(['Second row', 'A', 'B', 'C', '"Testing"', "ελληνικά"]) return response
Now, except utf-16, I had really tried everything in the encoding parameter of the writer, including utf-8, utf-8-sig, utf-8-le, utf-16-le and maybe others. Everytime I opened the file with excel I always saw garbage where the greek characters should've been.
Notepad++ was able to open the file without problems. What am I doing wrong ?
Update: Here's what I tried after jd's answer:
import csv response = HttpResponse(mimetype='text/csv') response['Content-Disposition'] = 'attachment; filename=test.csv' response.write(u'\ufeff'.encode('utf8')) writer = csv.writer(response, delimiter=';' , dialect='excel') writer.writerow(['Second row', 'A', 'B', 'C', '"Testing"', "ελληνικά"]) return response
Still no luck - now I also can see the BOM (as grabage) in Excel - I also tried using unicodecsv and some other options but once again nothign worked :(
Update 2: I tried this after dda's proposal:
writer = unicodecsv.writer(response, delimiter=';' , dialect='excel') writer.writerow(codecs.BOM_UTF16_LE) writer.writerow([ (u'ελληνικά').decode('utf8').encode('utf_16_le')])
Still no luck :( Here's the error I get:
UnicodeEncodeError at /csv/559 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)
Update 3: I am going crazy. Why is it so difficult ??? Here's another try:
response.write(codecs.BOM_UTF16_LE) writer = unicodecsv.writer(response, delimiter=';' , lineterminator='\n', dialect='excel', ) writer.writerow('ελληνικ') writer.writerow([ ('ελληνικά').decode('utf8').encode('utf_16_le')]) #A writer.writerow([ ('ελληνικά2').decode('utf8').encode('utf_16_le'), ('ελληνικά2').decode('utf8').encode('utf_16_le') ]) #B
and here's the contents of the Excel:
㯎㮵㯎㮻㯎㮻㯎㮷㯎㮽㯎㮹㯎ελληνικά딊묃묃뜃봃뤃먃갃㈃딻묃묃뜃봃뤃먃갃㈃
So I got some greek characters with the line #A. But line B, which is exaclty the same didn't produce me greek characters $^#$#^$#$#^ @@%$#^#^$#$ Pls hlep !
With Python's csv
module, you can write a UTF-8 file that Excel will read correctly if you place a BOM at the beginning of the file.
with open('myfile.csv', 'wb') as f:
f.write(u'\ufeff'.encode('utf8'))
writer = csv.writer(f, delimiter=';', lineterminator='\n', quoting=csv.QUOTE_ALL, dialect='excel')
...
The same should work with unicodecsv
. I suppose you can write the BOM directly to the HttpResponse
object, if not you can use StringIO
to first write your file.
Edit:
Here's some sample code that writes a UTF-8 CSV file with non-ASCII characters. I'm taking Django out of the equation for simplicity. I can read that file in Excel.
# -*- coding: utf-8 -*-
import csv
import os
response = open(os.path.expanduser('~/utf8_test.csv'), 'wb')
response.write(u'\ufeff'.encode('utf8'))
writer = csv.writer(response, delimiter=';' , dialect='excel')
writer.writerow(['Second row', 'A', 'B', 'C', '"Testing"', u"ελληνικά".encode('utf8')])
response.close()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With