My goal is to use enter this formula into Excel using Openpyxl.
=CORREL(IF(A2:A7;A2:A7);B2:B7)
The reason for the IF formula is to ignore the cells that have 0 inside, which are not ignored by the CORREL function by default.
When inserting this formula into excel the output is actually:
=CORREL(IF(@A2:A7;A2:A7);B2:B7)
This formula still works if the formula is entered in a cell which is on the same row of the range that is given (row 2-7 here). But gives #VALUE! when put on another row. Removing the "@" manually resolves the issue, but is not feasible because I want to use this for automating analyses.
I found out that the "@" is added for arrays in some cases by excel and replaces to "[#This Row]" from pre-2010 versions of Excel, which explains why the formula works when entered on the same row. The "@" symbol is also added to a formula by openpyxl when the formula is not recognized. However, both of these links do not explain why "@" is added in my example.
Is this a side-effect of using ranges in an IF function? Is this a bug(/feature) from parsing a range into an IF function?
Here is the code I used to replicate the error. My Openpyxl version is 3.0.5, I use Excel from Microsoft 365.
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
rows = [
    ['Size', 'Batch 1', 'Batch 2'],
    [2, 40, 30],
    [0, 40, 25],
    [0, 50, 30],
    [5, 30, 25],
    [6, 25, 35],
    [7, 20, 40],
]
for row in rows:
    ws.append(row)
formula = '=CORREL(IF(A2:A7,A2:A7),B2:B7)'
ws.cell(row=5, column=5, value=formula) # Formula in row 5 works
ws.cell(row=9, column=5, value=formula) # Formula in row 9 returns #VALUE!
formula = '=CORREL(A2:A7,B2:B7)'
ws.cell(row=6, column=5, value=formula) # CORREL function with ranges works fine
formula = '=IF(A2:A3,A2:A3)'
ws.cell(row=7, column=5, value=formula) # enters "=@IF(@A2:A3;A2:A3)"
formula = '=IF(A2,B2)'
ws.cell(row=8, column=5, value=formula) # enters "=IF(A2;B2)"
wb.save("test.xlsx")
In 'test.xlsx' this is my output (cell E5-E9)
-0.9528
-0.9528
#VALUE!
40
#VALUE!
And the formulae:
=CORREL(IF(@A2:A7;A2:A7);B2:B7)
=CORREL(A2:A7;B2:B7)
=@IF(@A2:A3;A2:A3)
=IF(A2;B2)
=CORREL(IF(@A2:A7;A2:A7);B2:B7)
Answering my own question here. Thanks to JvdV for putting me on the right path.
In this answer I found what solved my question. I added this line before saving the excel file in my example code:
ws.formula_attributes['E9'] = {'t': 'array', 'ref': "E9:E9"}
This essentially sets the formula in cell 'E9' to be read as an array formula, which is shown in Excel by having the formula between curly braces ({}). In older versions of Excel this is done by, pressing ctrl-shift-enter when entering the formula. Note that writing these curly braces in the formula string does not work.
If anyone knows how the 'ref' part works, feel free add this in the comments.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With