I have a dataframe with 12,000 rows and 34 columns. It takes pandas around 15 seconds to write it to an Excel file. I read a few discussions about the to_excel function, and one suggested way to make it faster is to pass engine='xlsxwriter'. I use the following code.
import pandas as pd

writer = pd.ExcelWriter('outputfile.xlsx', engine='xlsxwriter')
res_df.to_excel(writer, sheet_name='Output_sheet')
writer.close()
I'm wondering if there is a way to make this faster using Dask or any other library?
dataframe.memory_usage() gave me the following output:
Index 80
col1 95528
col2 95528
col3 95528
col4 95528
col5 95528
col6 95528
col7 95528
col8 95528
col9 95528
col10 95528
col11 95528
col12 95528
col13 95528
col14 95528
col15 95528
col16 95528
col17 95528
col18 95528
col19 95528
col20 95528
col21 95528
col22 95528
col23 95528
col24 95528
col25 95528
col26 95528
col27 95528
col28 95528
col29 95528
col30 95528
col31 95528
col32 95528
col33 95528
col34 95528
Thanks!
You can use pyexcelerate to get much faster write speeds.
from pyexcelerate import Workbook

# Build a list of rows: the header row first, then the data rows
values = [res_df.columns] + list(res_df.values)

wb = Workbook()
wb.new_sheet('sheet name', data=values)
wb.save('outputfile.xlsx')
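If you want to check the speed-up on data shaped like yours, here is a rough timing sketch. The randomly generated 12,000 × 34 frame is only a stand-in for your res_df, and the output file names are arbitrary; it assumes both xlsxwriter and pyexcelerate are installed.

import time

import numpy as np
import pandas as pd
from pyexcelerate import Workbook

# Stand-in for res_df: 12,000 rows x 34 numeric columns
res_df = pd.DataFrame(np.random.rand(12000, 34),
                      columns=[f'col{i}' for i in range(1, 35)])

# pandas + xlsxwriter
start = time.perf_counter()
res_df.to_excel('pandas_out.xlsx', sheet_name='Output_sheet', engine='xlsxwriter')
print('pandas/xlsxwriter:', time.perf_counter() - start, 'seconds')

# pyexcelerate
start = time.perf_counter()
values = [res_df.columns] + list(res_df.values)
wb = Workbook()
wb.new_sheet('Output_sheet', data=values)
wb.save('pyexcelerate_out.xlsx')
print('pyexcelerate:', time.perf_counter() - start, 'seconds')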