Why does wrapping an Excel array formula in a SUM not produce the sum of the contents in this case?

StackOverflow https://stackoverflow.com/questions/808011

  •  03-07-2019
  •  | 
  •  

Question

Clever People,

In Excel, it appears that wrapping an array formula in a SUM does not produce the sum of the contents of the array in all cases. (That is, it appears to me; clearly, I could well be confused.)

This came up when trying to write the equivalent of an inner join in Excel. For reasons involving Excel for Mac, I'm doing this without using Visual Basic or MS Query: only array formulas and INDEX/MATCH. I have, I think, succeeded, apart from this bizarre behaviour that I am at a loss to understand.

Currently I am using a kludge, but it would be nice not to have to. Any help or advice would be very greatly appreciated.

The following reproduces the issue:

Table 1

key | fkey
----+------
  a |   x
  a |   y
  b |   x
  b |   y

Table 2

key | value
----+------
  x |   1
  y |  10

The layout above is supposed to represent how these two little tables appear in the spreadsheet: nothing clever is going on. The four obvious ranges are named as table1.key, table1.fkey, table2.key, and table2.value.

So, then, the inner join. I'd like those values from Table 2 that correspond to a specific key in Table 1, with the join on `table1.fkey=table2.key'. Entering the formula:

{=("a"=table1.key)*INDEX(table2.value, MATCH(table1.fkey, table2.key, FALSE))}

into two vertical cells gives the required result; namely:

1
10

(Cool, or what?) So far, so fine. But now I actually want the sum of the two numbers above. So I simply wrapped the formula above in a SUM, and entered the result in a single cell:

{=SUM(("a"=table1.key)*INDEX(table2.value, MATCH(table1.fkey, table2.key, FALSE)))}

Result?

2

Does anyone have any idea what on earth is going on?

James

Was it helpful?

Solution

The row_num argument of INDEX cannot be an array when used in a single cell array formula. The INDEX function is always returning 1. Try this

=SUM(INDEX(table2.value,{2,1}))

It will always return 10, because INDEX will only accept the first element of the array. I don't know why they work differently for single cell array formulas vs multicell.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top