openpyxl: writing large excel files with python

I have a big problem with Python, openpyxl and Excel files. My goal is to write calculated data into a preconfigured Excel template: I load the template and write the data to it. There are two problems:

  1. The workbooks have more than 2 million cells, spread across several sheets.
  2. The write succeeds, but it takes an unacceptably long time.

I don't know of any other way to solve this. Maybe openpyxl is not the right tool. I tried writing xlsb, but I don't think openpyxl supports that format. I also tried the optimized writer and reader, but the problem appears when I save, because of the amount of data. The output file, however, is 10 MB at most. I'm really stuck. Do you know of another way to do this?

Thanks in advance.

DavidRguez asked Oct 01 '22 18:10


1 Answer

File size isn't really the issue when it comes to memory use; what matters is the number of cells held in memory. Your use case pushes openpyxl to its limits: it is currently designed to support either optimised reading or optimised writing, but not both at the same time. One thing you might try is to read the file in openpyxl with use_iterators=True (called read_only=True in later openpyxl releases). This gives you a generator whose rows you can pass to xlsxwriter, which should be able to write a new file for you. xlsxwriter is currently significantly faster than openpyxl at creating files. The solution isn't perfect, but it might work for you.
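
A minimal sketch of that read-then-rewrite idea, not a tested solution for the asker's template. It assumes a recent openpyxl, where the lazy reader is enabled with read_only=True rather than use_iterators=True, and it uses xlsxwriter's constant_memory option to keep the writer's memory use flat. The function name copy_workbook_streaming is made up for illustration. Only cell values are copied; styles, column widths and other template formatting are not, which is probably part of why the approach "isn't perfect".

```python
import openpyxl
import xlsxwriter


def copy_workbook_streaming(src_path, dst_path):
    """Stream cell values from src_path into a new xlsx file at dst_path.

    Hypothetical helper: copies values only, not styles or layout.
    """
    # read_only=True makes openpyxl yield rows lazily instead of
    # building every cell object in memory up front.
    src = openpyxl.load_workbook(src_path, read_only=True)
    # constant_memory=True makes xlsxwriter flush a row to disk once a
    # later row is started, so cells must be written in row order.
    dst = xlsxwriter.Workbook(dst_path, {"constant_memory": True})
    try:
        for sheet in src.worksheets:
            out = dst.add_worksheet(sheet.title)
            rows = sheet.iter_rows(values_only=True)
            for r, row in enumerate(rows):
                for c, value in enumerate(row):
                    if value is not None:  # leave empty cells unwritten
                        out.write(r, c, value)
    finally:
        dst.close()  # xlsxwriter only produces the file on close()
        src.close()  # read-only workbooks keep the source file open
```

Calculated data would be written inside the same row loop, in place of or alongside the value read from the template, since constant_memory mode does not allow going back to an earlier row.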

Charlie Clark answered Oct 13 '22 12:10