Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Write excel 2003 files from python 3.x

I have a very large CSV data set (several million records). I have already filtered and massaged and split this list to a clients specification. This was all done in Python3.3

The last requirement is that these split lists be saved in Excel format. They have a utility that imports an Excel spreadsheet (in a specific format) into their database after doing some calculations and checking for existing duplicates in the DB. My problem is that their utility only works on Excel 2003 .xls files... I didn't know this ahead of time.

So I can already write the data in the correct format for Excel 2007 using OpenPyXl, but these files won't work. I can write CSV files but those don't work either, their importer needs xls files. Maybe there is a way to batch convert all the files from Excel 2007 xlsx format to xls format, or from csv format to xls format? There are thousands of files so it can't be done by hand.

The best thing to do would be output them in the correct format, but I can't seem to find a python 3 compatible way that will work with Excel 2003 format. xlwt is python 2.x only.

Does anyone have suggestions how I can finish this?

EDIT: This is what the solution looked like.

EDIT2: Added the workbook close as suggested by stenci.

import os
import errno
import glob 
import time 
import win32com.client    

def xlsx_to_xls(path):
    xlsx_files = glob.glob(path+'\\*.xlsx') 

    if len(xlsx_files) == 0: 
        raise RuntimeError('No XLSX files to convert.') 

    xlApp = win32com.client.Dispatch('Excel.Application') 

    for file in xlsx_files: 
        xlWb = xlApp.Workbooks.Open(os.path.join(os.getcwd(), file)) 
        xlWb.SaveAs(os.path.join(os.getcwd(), file.split('.xlsx')[0] + '.xls'), FileFormat=1) 
        xlWb.Close()

    xlApp.Quit() 

    time.sleep(2) # give Excel time to quit, otherwise files may be locked 
    for file in xlsx_files: 
        os.unlink(file) 
like image 998
trashrobber Avatar asked Jul 24 '13 14:07

trashrobber


People also ask

How do you write in an Excel file using Python?

XlsxWriter is a Python module for writing files in the XLSX file format. It can be used to write text, numbers, and formulas to multiple worksheets. Also, it supports features such as formatting, images, charts, page setup, auto filters, conditional formatting and many others.

How do you write to an Excel file in Python using pandas?

Use pandas to_excel() function to write a DataFrame to an excel sheet with extension . xlsx. By default it writes a single DataFrame to an excel file, you can also write multiple sheets by using an ExcelWriter object with a target file name, and sheet name to write to.

How read and write XLS file in Python?

Openpyxl is a Python library for reading and writing Excel (with extension xlsx/xlsm/xltx/xltm) files. The openpyxl module allows Python program to read and modify Excel files.

Can Python process Excel files?

Python is great for processing Excel-files. You can handle large files much easier, you create reproducible code and you provide a documentation for your colleagues. We also saw the we have easily access to advanced features of Python. You could automate your whole reporting process.


2 Answers

Open them with Excel 2007 and save them as Excel 2003. You can do it with a simple VBA macro, or from Python, without even showing the Excel application to the user. The only problem is that you need Excel in your computer.

Here is the VBA code:

Sub ConvertTo2003(FileName As String)
  Dim WB As Workbook
  Set WB = Workbooks.Open(FileName, ReadOnly:=True)
  WB.SaveAs Replace(FileName, ".xlsx", ".xls"), FileFormat:=xlExcel8
  WB.Close
End Sub

Here is the Python code:

xlApp = Excel.ExcelApp(False)
xlApp.convertTo2003('FileName.xlsx')

class ExcelApp(object):
    def __init__(self, visible):
        self.app = win32com.client.Dispatch('Excel.Application')
        if visible:
            self.app.Visible = True

    def __exit__(self):
        self.app.Quit()

    def __del__(self):
        self.app.Quit()

    def convertTo2003(self, fileName):
        if self.app:
            wb = self.app.WorkBooks.Open(fileName, ReadOnly = True)
            wb.SaveAs(fileName[:-1], FileFormat = 56)
            wb.Close()

    def quit(self):
        if self.app:
            self.app.Quit()
like image 68
stenci Avatar answered Sep 30 '22 17:09

stenci


The situation has changed since the question was first asked (and answered). As of version 1.0.0, xlwt does work with Python 3. As such, it is arguably the most straightforward option for outputting Excel 2003 workbooks, and definitely the preferred way if you don't have Excel handy.

like image 37
John Y Avatar answered Sep 30 '22 17:09

John Y