Writing unicode strings to Excel 2007

I am connecting to an MS SQL server using pyodbc and trying to write the query results to an Excel 2007/10 .xlsx file using openpyxl.

This is my code (Python 2.7):

import pyodbc
from openpyxl import Workbook

cnxn = pyodbc.connect(host = 'xxx',database='yyy',user='zzz',password='ppp')
cursor = cnxn.cursor()

sql = "SELECT TOP 10   [customer clientcode] AS Customer, \
                [customer dchl] AS DChl, \
                [customer name] AS Name, \
                ...
                [name3] AS [name 3] \
        FROM   mydb \
        WHERE [customer dchl] = '03' \
        ORDER BY [customer id] ASC"

#load data
cursor.execute(sql)

#get column names from the cursor description
columns = [column[0] for column in cursor.description]

#using optimized_write cause it will be about 120k rows of data
wb = Workbook(optimized_write = True, encoding='utf-8')

ws = wb.create_sheet()
ws.title = '03'

#append column names to header
ws.append(columns)

#append data rows to the worksheet
for row in cursor:
    ws.append(row)

wb.save(filename = 'test.xlsx')

cnxn.close()

This works, at least until I encounter a customer with a name such as "mún". The code does not fail: everything is written to the file without complaint. The problem only appears when I open the file in Excel, which reports that the file is corrupted and needs to be repaired. After the repair, all data is lost.

I know the code works for customers with plain ASCII names; as soon as a name contains an accented character, the Excel file gets corrupted.

I have tried printing a single row (one with a difficult customer name). row is a tuple, and this is one of its elements:

'Mee\xf9s Tilburg'

So either writing the \xf9 (ù) character causes an error, or MS Excel cannot cope with it. I have tried various ways of converting a row to unicode (unicode(row,'utf-8'), u''.join(row), etc.), but nothing works: either I try something idiotic that raises an error, or the Excel file is still corrupted.
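A minimal sketch of what that value contains, assuming the driver is handing back ISO-8859-1 (Latin-1) byte strings, which is what the printed value suggests. The names raw, utf8_ok and text are only for illustration. It shows why decoding as UTF-8 cannot work while decoding as ISO-8859-1 does:

```python
raw = b'Mee\xf9s Tilburg'  # the byte string as printed from the row

# 0xF9 is not a valid UTF-8 lead byte, so decoding as UTF-8 raises
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ISO-8859-1 maps every byte to one code point; 0xF9 becomes U+00F9
text = raw.decode('ISO-8859-1')
```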

Any ideas?

876

asked Mar 08 '13 15:03

Rym

1 Answer

In the end I found two solutions:

The first was to convert the row returned by the cursor to a list and decode the affected elements in that list:

for row in cursor:
    l = list(row)
    # repeat this line for every column that needs decoding
    l[5] = l[5].decode('ISO-8859-1')
    ws.append(l)
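If the affected column indices are not known in advance, the same idea can be applied to every byte string in the row. This is only a sketch: decode_row is a hypothetical helper, not part of pyodbc or openpyxl, and it assumes all byte strings in the result set are ISO-8859-1 encoded.

```python
def decode_row(row, encoding='ISO-8859-1'):
    """Return a list copy of row with every byte string decoded to unicode."""
    # On Python 2.7 bytes is an alias of str, so this catches plain strings;
    # numbers, dates and None pass through unchanged.
    return [value.decode(encoding) if isinstance(value, bytes) else value
            for value in row]

sample = (42, b'Mee\xf9s Tilburg', None)  # stand-in for a pyodbc row
decoded = decode_row(sample)
```

In the loop above this would replace the per-column lines with ws.append(decode_row(row)).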

I expected this to be painfully slow, since six columns needed converting to unicode across 120k rows, but it actually ran quite fast.

The second solution, which only became apparent later, is to cast the data to unicode in the SQL statement itself (cast(x as nvarchar) AS y), which makes the decoding step unnecessary. I did not think of this at first because I assumed the server was already supplying the data as unicode. My bad.
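A sketch of how the cast could be written into the query. The helper as_nvarchar and the length of 100 are assumptions for illustration. An explicit length is worth giving because SQL Server's CAST to nvarchar without one defaults to 30 characters and silently truncates longer values.

```python
def as_nvarchar(column, alias, length=100):
    """Build a select-list item casting a bracketed column to nvarchar."""
    return "CAST([{0}] AS nvarchar({1})) AS [{2}]".format(column, length, alias)

# Only two of the original columns are shown; the rest follow the same pattern
select_list = ", ".join([
    as_nvarchar("customer name", "Name"),
    as_nvarchar("name3", "name 3"),
])

sql = ("SELECT TOP 10 " + select_list +
       " FROM mydb WHERE [customer dchl] = '03' ORDER BY [customer id] ASC")
```

The resulting sql string is passed to cursor.execute(sql) as before, and the rows can then be appended to the worksheet without a decoding step.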

197

answered Oct 03 '22 01:10

Rym

Tags: python, excel, unicode, openpyxl, pyodbc
