
How to save in openpyxl without losing formulae?

Because I need to parse and then use the actual data in cells, I open an xlsm in openpyxl with data_only = True.

This has proved very useful. Now, though, I have the same need for an xlsm that contains formulae in cells, and when I save my changes, the formulae are missing from the saved version.

Are data_only = True and formulae mutually exclusive? If not, how can I access the actual value in cells without losing the formulae when I save?

When I say I lose the formulae, I mean that the results of the formulae (sums, concatenations, etc.) are preserved, but the formulae themselves are no longer displayed when a cell is clicked.

UPDATE:

To confirm whether the formulae were being preserved, I re-opened the saved xlsm, this time with data_only left as False, and checked the value of a cell that had been constructed using a formula. Had the formulae been preserved, opening the xlsm with data_only set to False should have returned the formula. Instead, it returns the actual text value (which is not what I want).

Pyderman asked Sep 25 '15 00:09



2 Answers

If you want to preserve the integrity of the workbook, i.e. retain the formulae, then you cannot use data_only=True. The documentation makes this very clear.

Charlie Clark answered Oct 19 '22 12:10


Part of your question was: Are data_only = True and formulae mutually exclusive?

The answer to that, in openpyxl, is yes.

But this is not intrinsic to Excel. You could have a library like openpyxl which gives you access to both the formulas and their results. This is unlikely to happen, since the maintainer(s) of openpyxl are philosophically opposed to this idea.

So the expected way to handle your kind of situation in openpyxl is to load the workbook twice: once with data_only=True just to read the data (which you keep in memory), then again as a "different" workbook with data_only=False to get a writable version that still contains the formulae.
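A minimal sketch of that two-pass approach, assuming openpyxl is installed. The helper names `read_cached_values` and `edit_and_save`, the sheet name, and the paths are hypothetical and only for illustration. Pass `keep_vba=True` for an xlsm if you want openpyxl to carry the macros over on save.

```python
from openpyxl import load_workbook


def read_cached_values(path, sheet_name):
    """Pass 1: load with data_only=True and copy the cached values into memory.

    This workbook object is never saved, so the file's formulae are untouched.
    (Hypothetical helper, not part of openpyxl.)
    """
    wb = load_workbook(path, data_only=True)
    ws = wb[sheet_name]
    values = {
        cell.coordinate: cell.value
        for row in ws.iter_rows()
        for cell in row
    }
    wb.close()
    return values


def edit_and_save(src_path, dst_path, sheet_name, changes, keep_vba=False):
    """Pass 2: load with the default data_only=False, apply edits, and save.

    `changes` maps cell coordinates (e.g. "B1") to new values. Cells that are
    not overwritten keep their formulae. (Hypothetical helper.)
    """
    wb = load_workbook(src_path, keep_vba=keep_vba)
    ws = wb[sheet_name]
    for coordinate, value in changes.items():
        ws[coordinate] = value
    wb.save(dst_path)
```

For example, `values = read_cached_values("book.xlsm", "Sheet1")` gives you the computed results to work with, and `edit_and_save("book.xlsm", "book_out.xlsm", "Sheet1", {"B1": values["A1"]}, keep_vba=True)` writes a change without going through the data_only workbook. Note that cached results are only present if the file was last saved by an application that calculates them, such as Excel.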

The "canonical" way of modifying an existing workbook with Python while preserving everything else (including formatting, formulas, charts, macros, etc.) is to use a COM interface (such as PyWin32, or higher-level wrappers like pywinauto or xlwings) to control a running instance of Excel. Of course, this is only possible if you are running on a machine with Excel installed.

John Y answered Oct 19 '22 13:10