Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding How Excel Processes Array Functions

This question is based on an someone else's question that I answered. You can find the original question here, if desired. (And, maybe provide a better answer than I have.) However, everything you need to answer this question is right here.

Background

Given the following data in A1:C5

+-------------+--------+----------+
| Invoice Nbr |  Type  |  Status  |
+-------------+--------+----------+
| A0001       | Credit | Paid     |
| A0002       | Credit | Not Paid |
| B0001       | Debit  | Paid     |
| B0002       | Debit  | Not Paid |
+-------------+--------+----------+

The goal was to find a particular value in the table using an array formula that evaluated two columns. That is, give me the first Invoice (Column A) where Type is Debit and Status is Not Paid.

My first answer was an attempt to fix what I thought to be a problem with the OP's conditional. I put an AND wrapper around the two conditions as follows:

{=INDEX($A$2:$A$5,
        SMALL(IF(AND($B$2:$B$5 = "Debit", 
                     $C$2:$C$5 = "Not Paid"),
                 ROW($A$2:$A$5)-ROW($A$2)+1),
              1)
       )}

However, that didn't work.

I ended up suggesting this answer, because it actually works:

{=INDEX($A$2:$A$5,
        SMALL(IF($B$2:$B$5 & $C$2:$C$5 = "DebitNot Paid",
                 ROW($A$2:$A$5)-ROW($A$2)+1),
              1)
       )}

My Question

Array formulae in Excel are sometimes so much voodoo to me. It seems like either one should provide the result B0002. In fact, only the second formula gives the desired result. What principle or evaluation process is Excel (2013) following that makes this so? Or, in other words, what am I not understanding about how Excel manages array formulae?

like image 854
StoneGiant Avatar asked Sep 25 '18 18:09

StoneGiant


1 Answers

@StoneGiant is right. There are functions that can't be used inside array formulas because their sole "reason for being" is to turn an array into a scalar value. And sometimes you actually want those functions to return the single value inside an array formula, so how would the formula execution algorithm know which way you want it.

AND() and OR() are kind of good examples of this. =AND(array of values) returns a single value. MIN() is another example. Take the array formula

=SUM((A1:A10="Bob")*(Min(B1:B10))

MIN() is supposed to take a bunch of values and return the lowest. How would program MIN() to return an array? Where do you decide what's lowest in what context? You can't. That's why it doesn't work in array formulas.

Take SMALL() next. It can return an array formula because it's not only supposed to return the lowest value, but the nth lowest. Because of that nth argument, you can now return an array

=SMALL(A1:A10,{2,3})

That will return an array with two elements. MS could have programmed small to not work as an array formula, but in general I've found they've included array functionality wherever it was practical.

I said AND() and OR() are kind of good examples. MS could have programmed them to return an array like @StoneGiant's comment. It would have been a strange design decision to me, but they could have done it. But they already have AND and OR in array formulas. Multiply for AND and add for OR.

=MAX((B2:B5="Debit)*(C2:C5="Not Paid")*(ROW(B2:B5))

That will return the row number of the last line that's a debit and not paid. (You're example problem wanted the first one, but MIN introduces problems that complicate this example). The multiplications between the equalities is the same as AND. If you wanted the last row that's a debit and where column C says Not Paid or Not Sure, you could do that with addition.

=MAX((B2:B5="Debit)*((C2:C5="Not Paid")+(C2:C5="Not Sure"))*(ROW(B2:B5))

The plus sign is an OR between those two sets of parentheses. And the results of that OR is included against the other parentheses as an AND (multiplied).

You can put those in an INDEX

=INDEX(A2:A5, MAX((B2:B5="Debit)*(C2:C5="Not Paid")*(ROW(B2:B5)))

So to answer your question, I think: Generally, functions whose purpose is to turn a bunch of values into one value don't work inside array formulas. AND() and OR() are somewhat special because addition and subtraction already do that inside array formulas.

See more about arrays here http://dailydoseofexcel.com/archives/2004/04/05/anatomy-of-an-array-formula/

like image 182
Dick Kusleika Avatar answered Oct 11 '22 15:10

Dick Kusleika