
Why is it so much slower to export my data to .xlsx than to .xls or .csv?

I have a dataframe that I'm exporting to Excel, and people want it in .xlsx. I use to_excel, but when I change the extension from .xls to .xlsx, the exporting step takes about 9 seconds as opposed to 1 second. Exporting to a .csv is even faster, which I believe is due to the fact that it's just a specially formatted text file.

Perhaps the .xlsx format simply has many more features, so writing it takes longer, but I'm hoping there is something I can do to speed up the export.

Danny asked Nov 07 '25 06:11



1 Answer

Pandas defaults to using OpenPyXL for writing xlsx files, which can be slower than the xlwt module used for writing xls files.

Try it instead with XlsxWriter as the xlsx output engine:

df.to_excel('file.xlsx', sheet_name='Sheet1', engine='xlsxwriter')

It should be as fast as the xls engine.
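To check the difference on your own data, here is a minimal timing sketch. It assumes the engines you want to compare (xlsxwriter, openpyxl) are installed; the helper names `available_xlsx_engines`, `time_export`, `compare_engines` and `demo` are made up for this illustration, and the random dataframe in `demo` is only a stand-in for the real one.

```python
import importlib.util
import tempfile
import time
from pathlib import Path


def available_xlsx_engines(candidates=("xlsxwriter", "openpyxl")):
    """Return the candidate engines whose packages are importable here."""
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


def time_export(df, path, engine):
    """Write df to path with the given engine and return the elapsed seconds."""
    start = time.perf_counter()
    df.to_excel(path, sheet_name="Sheet1", engine=engine)
    return time.perf_counter() - start


def compare_engines(df, directory, engines=None):
    """Time one export per engine and return {engine_name: seconds}."""
    if engines is None:
        engines = available_xlsx_engines()
    return {
        engine: time_export(df, Path(directory) / f"out_{engine}.xlsx", engine)
        for engine in engines
    }


def demo():
    """Run the comparison on random stand-in data (requires pandas and numpy)."""
    import numpy as np
    import pandas as pd

    # Hypothetical data; substitute the dataframe you actually export.
    df = pd.DataFrame(np.random.rand(20_000, 10))
    with tempfile.TemporaryDirectory() as tmp:
        for engine, seconds in compare_engines(df, tmp).items():
            print(f"{engine}: {seconds:.2f} s")
```

Calling `demo()` prints one timing per installed engine. A single run is only a rough indication, so repeat it a few times before drawing conclusions.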

jmcnamara answered Nov 09 '25 20:11
