
Save Pandas DataFrames with formulas to xlsx files

In a Pandas DataFrame I have some "cells" with values and some that need to contain Excel formulas. I have read that I can create formulas with

link = 'HYPERLINK("#Groups!A' + str(someInt) + '"; "LINKTEXT")'
xlwt.Formula(link)

and store them in the DataFrame.

When I try to save my DataFrame as an xlsx file with

writer = pd.ExcelWriter("pandas" + str(fileCounter) + ".xlsx", engine = "xlsxwriter")
df.to_excel(writer, sheet_name = "Paths", index = False)
# insert more sheets here
writer.save()

I get the error:

TypeError: Unsupported type <class 'xlwt.ExcelFormula.Formula'> in write()

So I tried writing my formula as a string to my DataFrame, but then Excel offers to recover the file content and fills all formula cells with 0s.

Edit: I managed to get it to work with regular strings, but I would still be interested in a solution for xlwt formulas.

So my question is: how do I save DataFrames with formulas to xlsx files?

Samuel Blickle asked Jul 15 '18 13:07



2 Answers

Since you are using xlsxwriter, strings that start with = are written as formulas by default (from the docs: "strings_to_formulas: Enable the worksheet.write() method to convert strings to formulas. The default is True"), so you can simply store formulas as strings in your dataframe.
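Applied to the HYPERLINK case from the question, a minimal sketch could look like the following. The sheet contents, the file name and the helper build_links() are made up for illustration. Note that the formula uses a comma rather than a semicolon between arguments: XlsxWriter expects US-style formula syntax, and a semicolon separator is a plausible cause of the "recover file content" prompt described in the question, though that is an assumption and not something confirmed here.

```python
import os
import tempfile
import zipfile

import pandas as pd


def build_links(n):
    """Hypothetical helper: one HYPERLINK formula string per row.

    Each string starts with "=" so XlsxWriter writes it as a formula,
    and uses a comma (US-style) as the argument separator.
    """
    return ['=HYPERLINK("#Groups!A{}", "LINKTEXT")'.format(i) for i in range(1, n + 1)]


df = pd.DataFrame({"name": ["a", "b", "c"], "link": build_links(3)})
groups = pd.DataFrame({"group": ["g1", "g2", "g3"]})  # target sheet for the links

path = os.path.join(tempfile.mkdtemp(), "links_example.xlsx")
with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
    df.to_excel(writer, sheet_name="Paths", index=False)
    groups.to_excel(writer, sheet_name="Groups", index=False, header=False)

# Inspect the raw worksheet XML to confirm the cells were stored as formulas.
with zipfile.ZipFile(path) as zf:
    sheet_xml = zf.read("xl/worksheets/sheet1.xml").decode("utf-8")
formula_count = sheet_xml.count("<f>")
```

The context manager closes the writer on exit, so no explicit save or close call is needed.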

Example of a formula column which references other columns in your dataframe:

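import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
writer = pd.ExcelWriter("foo.xlsx", engine="xlsxwriter")
# Create the column first so that get_loc("product") works below.
df["product"] = None
# Relative R1C1 references: the offsets point from "product" back to col1 and col2.
df["product"] = (
    '=INDIRECT("R[0]C[%s]", 0)+INDIRECT("R[0]C[%s]", 0)'
    % (
        df.columns.get_loc("col1") - df.columns.get_loc("product"),
        df.columns.get_loc("col2") - df.columns.get_loc("product"),
    )
)
df.to_excel(writer, index=False)
writer.close()  # writer.save() on older pandas; save() was removed in pandas 2.0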

This produces the following output:

[Image: example output in LibreOffice]
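As an alternative to the relative INDIRECT references, explicit A1-style references can be generated per row. The sketch below is not part of the original answer; it uses xl_rowcol_to_cell from xlsxwriter.utility and assumes the sheet is written with index=False and the default startrow of 0, so that data row r lands on worksheet row r + 1. The column name "total" and the file name are illustrative.

```python
import os
import tempfile
import zipfile

import pandas as pd
from xlsxwriter.utility import xl_rowcol_to_cell

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
c1 = df.columns.get_loc("col1")  # 0-indexed column positions
c2 = df.columns.get_loc("col2")

# Worksheet row 0 holds the header, so DataFrame row r is worksheet row r + 1.
df["total"] = [
    "={}+{}".format(xl_rowcol_to_cell(r + 1, c1), xl_rowcol_to_cell(r + 1, c2))
    for r in range(len(df))
]

path = os.path.join(tempfile.mkdtemp(), "a1_refs_example.xlsx")
with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
    df.to_excel(writer, index=False)

# Read back the raw worksheet XML to check that formulas were written.
with zipfile.ZipFile(path) as zf:
    sheet_xml = zf.read("xl/worksheets/sheet1.xml").decode("utf-8")
```

The trade-off is that these references are tied to the sheet layout: if startrow, startcol or index change, the row and column offsets have to be adjusted accordingly.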

Motin answered Oct 17 '22 02:10


After writing the DataFrame with table.to_excel(writer, sheet_name=...), I use write_formula() as in the example below (edited to add the full loop). To write all the formulas in your dataframe, loop over the rows and read each formula from the dataframe.

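 # worksheet is the XlsxWriter worksheet (e.g. writer.sheets[sheet_name]);
 # startrow is the 0-indexed row where the table starts, col the target column.
 # Replace the right-hand side below with the formula read from your dataframe,
 # e.g., formula_to_write = df.loc[...]

 rows = table.shape[0]
 # row_num is 0-indexed and skips the header; A1 references are 1-indexed, hence row_num + 1
 for row_num in range(1 + startrow, rows + startrow + 1):
     formula_to_write = '=I{} * (1 - AM{})'.format(row_num + 1, row_num + 1)
     worksheet.write_formula(row_num, col, formula_to_write)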

Later in the code I call writer.save() and workbook.close() (I seem to recall one of these might be redundant, but I haven't looked it up).

Documentation is here.
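For reference, here is a self-contained sketch of the same approach. The column names, sheet name, file name and the discount formula are invented for illustration and are not taken from the original code; the writer is closed by the context manager, so no separate save or close call is used.

```python
import os
import tempfile
import zipfile

import pandas as pd

table = pd.DataFrame({"price": [10.0, 20.0, 30.0], "discount": [0.1, 0.2, 0.0]})
startrow = 0           # 0-indexed worksheet row where the header is written
col = table.shape[1]   # first free column to the right of the data (0-indexed)

path = os.path.join(tempfile.mkdtemp(), "write_formula_example.xlsx")
with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
    table.to_excel(writer, sheet_name="Paths", index=False, startrow=startrow)
    worksheet = writer.sheets["Paths"]  # underlying XlsxWriter worksheet

    worksheet.write(startrow, col, "net")  # header for the formula column
    for row_num in range(1 + startrow, table.shape[0] + startrow + 1):
        excel_row = row_num + 1  # A1 notation is 1-indexed
        worksheet.write_formula(
            row_num, col, "=A{0} * (1 - B{0})".format(excel_row)
        )

# Read back the raw worksheet XML to check that formulas were written.
with zipfile.ZipFile(path) as zf:
    sheet_xml = zf.read("xl/worksheets/sheet1.xml").decode("utf-8")
```

Writing the formulas after to_excel() keeps them out of the DataFrame itself, which can be convenient when the same frame is also used for calculations in Python.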

David Gaertner answered Oct 17 '22 00:10