Because this is essentially a duplicate, I address a few issues that do not explicitly overlap with the related question or its answer:
If a class has cumulative relative frequency exactly 0.5, then the median is at the boundary between that class and the next larger one.
If $N$ is large (really the only case where this method is generally successful), there is little difference between $N/2$ and $(N+1)/2$ in the formula. All references I checked use $N/2$.
Before computers were widely available, large datasets were customarily reduced to categories (classes) and plotted as histograms. Then the histograms were used to approximate the mean, variance, median, and other descriptive measures. Nowadays, it is best just to use a statistical computer package to find exact values of all measures.
One remaining application is to try to recover the descriptive measures from grouped data or from a histogram published in a journal. These are cases in which the original data are no longer available.
This procedure to approximate the sample median from grouped data assumes that the data are distributed roughly uniformly throughout the median interval. It then uses linear interpolation within that interval to approximate the median. (By contrast, the methods that approximate the sample mean and sample variance from grouped data assume that all observations are concentrated at their class midpoints.)
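The interpolation described above is usually written in a form like the following. The symbols $L$, $F$, $f$, and $w$ are notation introduced here for illustration and are not taken from the question:

$$\text{median} \approx L + \frac{N/2 - F}{f}\, w$$

Here $L$ is the lower boundary of the median class, $F$ is the cumulative frequency of all classes below it, $f$ is the frequency of the median class, $w$ is its width, and $N$ is the total number of observations. When the cumulative frequency through the median class is exactly $N/2$, the fraction $(N/2 - F)/f$ equals 1 and the formula returns the upper boundary of that class.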
According to what I learned, the class where the median is located is the lowest class for which the cumulative frequency equals or exceeds $\frac N2$.
Therefore, the median class would be 30-40, which gives a median of about 30.833, or approximately 31 as you said.
Each row is a separate dataset (up to 150 rows in a spreadsheet). The columns give the frequency in each group. I can manually find the median class and calculate the median for each row (albeit with some difficulty), but I would like to make it a more automatic procedure.
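One way such a row-by-row calculation might be automated is sketched below in R. This is only a sketch under stated assumptions: the function name `grouped_median`, the class boundaries in `breaks`, and the frequencies in `freqs` are all made up for illustration and are not the data from the spreadsheet. It uses $N/2$ and linear interpolation within the median class.

```r
# Hypothetical helper: interpolated median from grouped frequencies.
# freq   : class frequencies, ordered from lowest to highest class
# breaks : class boundaries, one more element than freq
grouped_median <- function(freq, breaks) {
  stopifnot(length(breaks) == length(freq) + 1, sum(freq) > 0)
  n     <- sum(freq)
  cum   <- cumsum(freq)
  k     <- which(cum >= n / 2)[1]            # lowest class whose cumulative frequency reaches N/2
  below <- if (k > 1) cum[k - 1] else 0      # cumulative frequency of the classes below it
  width <- breaks[k + 1] - breaks[k]
  breaks[k] + (n / 2 - below) / freq[k] * width
}

# Made-up example: each row is one dataset, each column one class.
breaks <- c(0, 10, 20, 30, 40, 50)
freqs  <- rbind(c(5, 10, 20, 10, 5),
                c(2,  8, 14, 12, 4))

# Apply the helper to every row.
medians <- apply(freqs, 1, grouped_median, breaks = breaks)
print(medians)
```

If the spreadsheet is exported to CSV and read with `read.csv`, the frequency columns can be passed through `as.matrix` before the `apply` call. The class boundaries would have to be supplied to match the actual groups.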
I hope the screen shot below helps.
I find it amazing that no one has suggested aggregate yet, seeing as it is the simple base R function included for these sorts of tasks. E.g.:
aggregate(. ~ genotype, data=dat, FUN=median)
# genotype DIV3 DIV4
#1 HET 1.4 3.20
#2 WT 23.9 25.25
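To try the aggregate call without the original data, a small made-up data frame is enough. The values below are invented for illustration and are not the asker's data; only the column names follow the output shown above.

```r
# Invented data: two genotypes, two numeric columns.
dat <- data.frame(
  genotype = c("HET", "HET", "WT", "WT", "WT"),
  DIV3     = c(1, 3, 10, 20, 30),
  DIV4     = c(2, 4, 5, 15, 25)
)

# Median of every other column within each genotype.
res <- aggregate(. ~ genotype, data = dat, FUN = median)
print(res)
```

The formula `. ~ genotype` means "all remaining columns, grouped by genotype", so new numeric columns are picked up without changing the call.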
I found ddply to be the best for this.
library(plyr)  # ddply and numcolwise come from the plyr package
medians = ddply(a, .(genotype), numcolwise(median))