Question

I'm trying to calculate the conditional median of a chart that looks like this:

A  |  B
-------
x  |  1
x  |  1
x  |  3
x  |  
y  |  4
z  |  5

I'm using MS Excel 2007. I am aware of the AVERAGEIF() function, but there is no equivalent for the median. The main trick is that there are rows with no data, such as the 4th "x" above. I don't want such rows considered at all in the calculation.

Googling has suggested the following, but Excel won't accept the formula format (maybe because it's 2007?)

=MEDIAN(IF((A:A="x")*(A:A<>"")), B:B)

Excel gives an error saying there is something wrong with my formula (something to do with the * in the condition). I also tried the following, but it counts blank cells as 0 in the calculation:

=MEDIAN(IF(A:A = "x", B:B, ""))

I am aware that these are array formulas, which means one must press Ctrl+Shift+Enter to get them to work correctly.

How can I do a conditional evaluation and not consider blank cells?


Solution

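Use nested IF statements:

=MEDIAN(IF(A:A = "x",IF(B:B<>"",B:B, ""),""))

Not much to explain: it checks whether A is x. If it is, it checks whether B is non-blank. Only values that meet both conditions are included in the median; every other row yields an empty string, which MEDIAN ignores. Enter it as an array formula with Ctrl+Shift+Enter.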

Given the following data set:

A | B
------
x | 
x |     
x | 2
x | 3
x | 4
x | 5

The above formula returns 3.5, which is what I believe you wanted.
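For readers who want to check the logic outside Excel, here is a minimal Python sketch of the same two-condition filter. It is only an illustration, not part of the Excel solution: the `rows` list and the `conditional_median` helper are hypothetical names, and `None` is assumed to stand in for a blank cell.

```python
from statistics import median

# Rows from the example table as (A, B) pairs; None stands in for a blank cell.
rows = [("x", None), ("x", None), ("x", 2), ("x", 3), ("x", 4), ("x", 5)]

def conditional_median(rows, key):
    """Median of B over rows where A equals key and B is not blank."""
    # Same two tests as the nested IFs: A matches, then B is non-blank.
    values = [b for a, b in rows if a == key and b is not None]
    return median(values)

print(conditional_median(rows, "x"))  # 3.5
```

The blank rows never reach the median, which mirrors how the nested IFs hand MEDIAN only the values that pass both tests.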

OTHER TIPS

Use the Googled formula, but instead of pressing Enter after you type it into the formula bar, press Ctrl+Shift+Enter. This places curly braces around the formula, and Excel treats it as an array formula. Note that the formula as quoted in the question has a misplaced parenthesis and tests column A for blanks; the intended form is presumably =MEDIAN(IF((A:A="x")*(B:B<>""),B:B)).

Be warned: if you edit the formula, you cannot finish with a plain Enter, or it will no longer be valid. When editing, you must press Ctrl+Shift+Enter again when done.

There is another way that avoids the array formula and its Ctrl+Shift+Enter requirement. It uses the AGGREGATE() function offered in Excel 2010, 2011 and later. The method also works for min, max and various percentiles. AGGREGATE() can ignore errors, so the trick is to make every value that is not required produce an error. The easiest way to do the task set above is:

=Aggregate(16,6,(B:B)/((A:A = "x")*(B:B<>"")),0.5)

The first and last parameters (16 and 0.5) request the 50th percentile, which is the median. The second (6) says to ignore all errors, including #DIV/0!. The third takes the column B data and divides it by a number that is 1 for every non-empty value with an x in column A, and 0 otherwise. Since a/1 = a and a/0 = #DIV/0!, the wanted values pass through unchanged, while the rest become divide-by-zero errors and are ignored.
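The divide-by-mask idea can be sketched in Python to show why it works. This is an illustration under stated assumptions, not Excel's implementation: `percentile_inc` and `aggregate_percentile` are hypothetical helpers, `None` stands in for a blank cell, and the percentile is assumed to use inclusive linear interpolation.

```python
import math

def percentile_inc(values, p):
    """Percentile by linear interpolation over sorted data (inclusive definition)."""
    data = sorted(values)
    pos = p * (len(data) - 1)          # zero-based fractional position
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

def aggregate_percentile(col_a, col_b, key, p):
    """Divide each B by a 0/1 mask and skip the divisions that fail."""
    kept = []
    for a, b in zip(col_a, col_b):
        # 1 only when A matches and B is non-blank, otherwise 0.
        mask = int(a == key) * int(b is not None)
        try:
            kept.append((b or 0) / mask)   # b / 1 keeps the value
        except ZeroDivisionError:          # b / 0 plays the role of the ignored error
            continue
    return percentile_inc(kept, p)

col_a = ["x", "x", "x", "x", "y", "z"]
col_b = [1, 1, 3, None, 4, 5]
print(aggregate_percentile(col_a, col_b, "x", 0.5))  # 1.0
```

Only the three non-blank x values (1, 1, 3) survive the division, so the result is their median.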

The technique works for quartiles (with an appropriate p value), for all other percentiles of course, and for max and min by using the LARGE or SMALL function numbers (14 and 15) with appropriate arguments.

This is a similar construct to the SUMPRODUCT() tricks that are so popular, but those cannot be used for quantiles or max/min values, because they produce zeros, which look like numbers to these functions.

Bob Jordan

Perhaps to generalize it a little more, instead of this...

{=MEDIAN(IF(A:A="x",IF(B:B<>"",B:B)))}

... you could use the following:

{=QUARTILE.EXC(IF(A:A="x",IF(B:B<>"",B:B)),2)}

Note that the curly braces denote an array formula; do not type them yourself, but press Ctrl+Shift+Enter (or Cmd+Shift+Enter on macOS) when entering the formula.

Then you could easily get the first and third quartile by changing the last number from 2 to 1 or 3 respectively. QUARTILE.EXC is, by the way, what most commercial statistical software (e.g. Minitab) uses. The "regular" function is QUARTILE.INC or, in older versions of Excel, just QUARTILE.
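To see how the exclusive and inclusive quartile definitions differ, here is a small Python sketch using `statistics.quantiles` (Python 3.8 or later). The mapping of `method="exclusive"` to QUARTILE.EXC and `method="inclusive"` to QUARTILE.INC is an assumption based on the interpolation rules each uses, and the sample values are the four non-blank x values from the example table earlier on this page.

```python
from statistics import quantiles

values = [2, 3, 4, 5]  # non-blank "x" values from the example table

# Assumed correspondence: "exclusive" ~ QUARTILE.EXC, "inclusive" ~ QUARTILE.INC.
q_exc = quantiles(values, n=4, method="exclusive")
q_inc = quantiles(values, n=4, method="inclusive")

print(q_exc)  # [2.25, 3.5, 4.75]
print(q_inc)  # [2.75, 3.5, 4.25]
```

Both methods agree on the median (the second quartile) here, but the first and third quartiles differ, so it matters which definition you pick when comparing against other software.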

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow