Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"Non-contiguous" range specification in formula

[2018-08-01: See new material near bottom of post.]

I just ran across a Q-n-A here that included an example of a function referencing a range using a specification I had not seen before. I played around with it a little and discovered that most Excel functions will return a value when using this kind of reference:

=SUM(A1:A5:C1:C5:E1:E2:F3:F4)

Notice all the ":"s in there.

With the cursor in the formula bar, those non-contiguous cells are highlighted, as if only those cells would be totaled. However, when I experimented, I discovered that it is equivalent to this formula:

=SUM(A1:F5)

However, this formula does what I expect:

=SUM(A1:A5,C1:C5,E1:E2,F3:F4)

Example: I have the range A1:F5 defined as:

--  A   B   C   D   E   F

1   1   2   3   4   5   6

2   2   4   6   8   10  12

3   3   6   9   12  15  18

4   4   8   12  16  20  24

5   5   10  15  20  25  30 

The formula results are:

=SUM(A1:F5) is 315

=SUM(A1:A5:C1:C5:E1:E2:F3:F4) is 315

=SUM(A1:A5,C1:C5,E1:E2,F3:F4) is 117

This illustrates my point that the first two seem to be equivalent, and the third "correct" way of defining a discontiguous range gives a different result (but what I would expect).

I looked around here and elsewhere on the Internet and found no relevant discussion.

So, my questions:

  1. Is that odd range reference of any real use?
  2. is it really a valid sort of range reference?

[New material 2018-08-01]

@YowE3K wondered what would happen with an odd number of cells. Here is my example:

--  A   B   C  

1   1   1   1   

2   1   1   1   

3   1   1   1   

4   1   1   1   

Now, in a faraway cell, I enter this formula:

=SUM(A1:B3:$C$2)

Then if I drag the formula around, it creates interesting results, basically the 3-row, 2-column range extended to a rectangular range that always includes the fixed cell. The smallest sum you can ever get in this example is 3. Try it and see.

This shows a glimmer of something useful, though I haven't come up with a concrete use for it yet.

like image 557
vknowles Avatar asked Jan 14 '17 19:01

vknowles


People also ask

What is a non contiguous range in Excel?

A non-contiguous range consists of two or more separate blocks of cells. These blocks can be separated by rows or columns as shown by the ranges A1 to A5 and C1 to C5.

How do you select both contiguous and non contiguous ranges?

To select a range, select a cell, then with the left mouse button pressed, drag over the other cells. Or use the Shift + arrow keys to select the range. To select non-adjacent cells and cell ranges, hold Ctrl and select the cells.

What represents a continuous range of cells?

Active cells that are all adjacent to each other signify a continuous range. To define a continuous range, click on an active cell and drag your cursor to highlight neighboring cells. Active cells that have gaps between them form a non-continuous range.


1 Answers

  1. When separated by a comma, simpler is better. Avoid referencing subsets within the same rectangular block that Excel would create if you only selected the upper left and bottom right of the region in question.

  2. No, there are not too many practical exercises that warrant this additional complexity (when referencing cells separated by a colon only). I read a comment/proposed response by someone who suggested there is use, depending on the data structure. I advise that individual seek a better format/ layout of data before using an overly complex referencing regime/strategy

  3. RE: separation by comma: Yes, there are numerous practical and valid exercises that are being managed in suitable fashion

Background / detail

You have asked two questions: i.e. pertaining to 1) validity of 'odd referencing' and 2) potential use (paraphrased).

Re: validity - going back to basics, when you sum the whole are you are simply referencing the upper most left and bottom most right cells - Excel automatically creates a 'perimeter' to encompass any and all cells separated by a colon. This perimeter is always constructed to form a rectangle (the square being a special type of rectangle). You can actually 'force' Excel to keep these cells separated initially, i.e. the first and second summations are identical (parametrically, and in terms of their result):

Figure 1) Usual summation of a rectangular enclosed block

Figure 2) Identical variation depicting upper left and bottom right cells

Probably the most important implication/consequence of this is as follows:

  • You cannot double-sum no matter how many 'sub-ranges', disparate / overlapping or otherwise, PROVIDED these are separated by a colon. For instance, the following gives exactly the same solution as the first 2 figures:

Figure 3) Union of cells within outer perimeter

However, for most practical applications, it would be rare and odd to require multi-subset referencing (especially when it comes to summation). I could envisage something like this when using various subsets that are separated with a comma.

Generally, parsimony/simplicity in formulation is required (for auditability, reviewability, validation, computationally speed, error mitigation, and so on).

I believe it would be quite challenging to concoct / fathom a scenario where referencing several subsets within (or across) tables of data served a cause that couldn't be achieved using the simple, straightforward referencing per Figure 1).

In fact, it would require a highly bespoke exercise/structure to justify something like the summation shown in Figure 3, and as a long-serving manager who reviews excel models almost on a daily basis, I would certainly advise against constructing a summation that referenced in this manner.

The more that I consider, it, multiple referencing of this manner impedes functionality. For instance, using the referencing style in Figures 1-2 allows one to take summation across multiple (adjacent) sheets (which I'm not a big fan of because any sheet that's placed between Sheets 4 and 2 will be included in the summation, which can lead to errors):

=SUM(Sheet4:Sheet2!A1:F5)

Where Sheets 2, 3, and 4 contain identical values to those you provided, which yields 945 (=315 x 3) as one would expect. Doing this for version 3 results in #Value! (not even Excel likes it!).

Now, in terms of separation by commas, this is an entirely different thing but involves similar 'mechanics' under the hood. Excel will only 'perimeter-fence' cells separated by a colon, and it will create several 'ring-fenced' rectangular perimeters, one for each group of (colon-separated) cells that are separated by a comma.

This 'setup' has a number of practical uses: it allows non-contiguous groups of cells to be summed; different conditions can be placed on each e.g. here's a function that sums all even numbers on the left hand side, and all odd numbers on the right:

=SUM(A1:C5*(MOD(A1:C5,2)=0),D1:F5*(MOD(D1:F5,2)<>0))

This yields 99 (if I haven't mixed up the mod divisor/operand:). PS - at this point I realised I copied your values incorrectly and inadvertently included the row labels as a column. So the 99 is based upon the 'correct' values which reconciles to results you've shown.

Of course, I could take the sum of the left & right side (without any such conditions, or with some condition that will also be satisfied, e.g. mod(a,b) < 0), where each half is separated by a comma. Of course this would yield the exact same overall result as summing the entire region like shown in Figures 1-3.

So, when separated by a comma, Excel treats this exactly the same as Figure 1-3 BUT it only does so in respect of each 'subset' of 'colon' separated cells. Here, duplication/overlapping or omitted ranges do count / make a difference.

There is no 'odd' form of referencing when it comes to colon or comma, there is good/best practice, but with the infinite different possibilities/data table constructs, initiatives and uses/models etc/ there will undoubtedly be some use for complicated forms of straightforward summations (when separated by a colon); I see these as being rare/far-between and advise against a complicated set up if it can be avoided.

However, I see greater application / purpose for a similar construct with 2/3 or more subsets separated by a comma. In fact, this effectively achieves the same outcome as the status bar message (with summation statistics selected to 'display', i.e. after right-clicking and selecting it):

Figure 4

like image 140
JB-007 Avatar answered Oct 18 '22 02:10

JB-007