How to calculate median from a frequency table?
How to calculate a median when the items are in various quantities
Excel pivot table - median
excel - How do you figure out the MEDIAN of a column taking into account filters? - Stack Overflow
I'm working with age data. Age groups in Column A (0, 1, 2, 3 .... 100+), and frequency in Column B. I'd like a simple way of calculating median age.
I thought I had the solution already: add cumulative frequency in Column C, starting with 0 in C2, then entering =B2 in C3, =B3+C3 in C4, =B4+C4 in C5, and so on. Put the position of the median in cell E3 using =(C103 + 1) / 2 (where C103 holds the total cumulative frequency). Then find the median itself using =LOOKUP(E3, C2:C103, A2:A102).
Whilst that solution seems appropriate in the vast majority of cases, it occurred to me that the median position might fall between two categories. For example, we might get a median position of 100.5, where the person in position 100 is aged 30 and the person in position 101 is aged 31. Working through that manually, the median age should be 30.5, but if I'm not mistaken the process described above would return 31 in that case. I'll need to repeat this for a large number of datasets and won't be able to check them all manually, so I think the formula itself needs to change, but this is where my Excel skills fall short.
Does anybody know how I can update the solution above to calculate the median correctly, including any cases in which the median position falls between two categories?
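The logic being asked for can be sketched outside Excel. The Python below is a minimal illustration, not an Excel formula: it assumes the categories are numeric and sorted in ascending order (so an open-ended group such as "100+" would have to be represented by a single number), and the function name `median_from_frequencies` is made up for this example. It looks up the two middle positions separately and averages them, which covers the case where the median position falls between two categories.

```python
from bisect import bisect_left
from itertools import accumulate


def median_from_frequencies(values, counts):
    """Median of data given as ascending distinct values and their counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations")

    # cumulative[i] = number of observations with value <= values[i]
    cumulative = list(accumulate(counts))

    def value_at(position):
        # position is 1-based within the sorted, fully expanded data;
        # the first cumulative count that reaches it identifies the category
        return values[bisect_left(cumulative, position)]

    # For an odd total both positions coincide; for an even total they are
    # the two middle observations, which may sit in different categories.
    lower = value_at((total + 1) // 2)
    upper = value_at((total + 2) // 2)
    return (lower + upper) / 2


# Hypothetical data: 200 people, positions 100 and 101 are aged 30 and 31.
ages = [29, 30, 31, 32]
freq = [50, 50, 60, 40]
print(median_from_frequencies(ages, freq))
```

With this made-up data the two middle positions land in the 30 and 31 groups, so the result is their average rather than 31.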
I'm thinking I could build an ugly array and do this in an ugly fashion,
but I'm also thinking there must be an elegant approach already. It must have come up before.
When pricing merchandise, I can build averages manually by adding up all the quantities and the total of all the item costs.
If I have various quantities and a price for each quantity,
how do I pretend I have a list of all the items as individual items and find the median price point?
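One literal way to "pretend" the items are listed individually is to expand each price by its quantity and take an ordinary median. The Python sketch below shows that idea; it is not an Excel formula, the function name `median_unit_price` and the sample numbers are hypothetical, and quantities are assumed to be non-negative integers.

```python
from itertools import chain, repeat
from statistics import median


def median_unit_price(price_quantity_pairs):
    """Median price per item, counting each price once per unit sold."""
    # Repeat every price `quantity` times, as if each unit were its own row.
    expanded = chain.from_iterable(
        repeat(price, quantity) for price, quantity in price_quantity_pairs
    )
    return median(expanded)


# Hypothetical price list: (unit price, quantity)
items = [(2.0, 3), (5.0, 1), (10.0, 2)]
print(median_unit_price(items))
```

This is simple but materializes one entry per unit, so for very large quantities an approach based on cumulative counts is likely to scale better.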
Updated to pick up comments from lori_m below
1. Original answer - all xl versions
Courtesy of Aladin Akyurek from this solution
If your data is in A1:A10, enter this as an array formula with Ctrl+Shift+Enter
=MEDIAN(IF(SUBTOTAL(3,OFFSET(A1:A10,ROW(A1:A10)-MIN(ROW(A1:A10)),,1)),A1:A10))
2. Updated answer (non-array) - all xl versions
=MEDIAN(IF(SUBTOTAL(3,OFFSET(A1:A10,MMULT(ROW(A1:A10),1)-MIN(MMULT(ROW(A1:A10),1)),,1)),A1:A10))
3. Excel 2010 & Excel 2013
=AGGREGATE(12,5,A1:A10)
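Conceptually, the formulas above all aim at the same thing: take the median over only the rows that are still visible after filtering. A minimal Python sketch of that idea follows. It is an illustration only; the `hidden` flags stand in for Excel's filter state and the function name `median_of_visible` is invented for this example.

```python
from statistics import median


def median_of_visible(values, hidden):
    """Median of the values whose corresponding row is not hidden."""
    # Keep a value only when its row is not filtered out.
    visible = [value for value, is_hidden in zip(values, hidden) if not is_hidden]
    if not visible:
        raise ValueError("no visible rows")
    return median(visible)


# Hypothetical column with the last row filtered out.
column = [1, 2, 3, 4, 100]
hidden = [False, False, False, False, True]
print(median_of_visible(column, hidden))
```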
True, SUBTOTAL does not include MEDIAN, but you can obtain it by first using SUBTOTAL and then substituting part of the formula in every cell where you need the MEDIAN.
Imagine you used AVERAGE in the Excel subtotal function; the formula would then be, for instance, =SUBTOTAL(1;E10:E12)
Now, replace all occurrences of "SUBTOTAL(1;" with "MEDIAN(" (without the quotes).
Avoid the AND function, use nested IFs instead.
Specifically:
=MEDIAN(IF(A2:A21=1,IF(B2:B21="x",C2:C21)))
The issue is that AND doesn't work as you intend in an array context -- it doesn't AND each pair of elements and produce an array, it ANDs all the elements of both arrays to give a single scalar result.
Your original formula is evaluating the entire AND call to a single output of "FALSE" (because it's not the case that every element in both arrays satisfies the comparison).
I don't know why it works, but after messing around I found the following accomplishes my goal:
{=MEDIAN(IF((A2:A21=1)*(B2:B21="x"),C2:C21))} #gives correct 15
(Again, remember to enter it as an array formula using Ctrl+Shift+Enter.)
My guess is that the AND function only produces a single TRUE or FALSE by default, and therefore can't work in this context (which requires producing an array of 1s and 0s). However, if I process each column as its own array, I produce a string of 1s and 0s for each subcondition. If I then multiply the conditional arrays, the result is itself an array of 1s and 0s in which a 1 appears only where both subconditions are true.
I'm sure someone can confirm/explain this in a comment.
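The difference between one AND over everything and an element-wise product can be mimicked in Python. The sketch below is an analogy rather than Excel's actual evaluation model, and the three short lists are made-up stand-ins for columns A, B and C.

```python
from statistics import median

# Hypothetical stand-ins for columns A, B and C.
a = [1, 2, 1, 1]
b = ["x", "x", "y", "x"]
c = [10, 99, 50, 20]

# Like AND over whole arrays: every comparison is folded into ONE boolean,
# which is False as soon as any single element fails.
scalar_and = all(v == 1 for v in a) and all(v == "x" for v in b)

# Like (A=1)*(B="x"): multiply the comparisons row by row, giving one
# 1 or 0 per row (True * True == 1 in Python).
mask = [(va == 1) * (vb == "x") for va, vb in zip(a, b)]

# Keep only the C values whose row passed both conditions, then take the median.
selected = [vc for vc, keep in zip(c, mask) if keep]
result = median(selected)

print(scalar_and, mask, result)
```

The scalar version gives no per-row information, so it cannot select rows; the element-wise mask can, which is why both the multiplication and the nested IFs work.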