Since you already know the formula, it should be easy enough to create a function to do the calculation for you.

Here, I've created a basic function to get you started. The function takes four arguments:

  • frequencies: A vector of frequencies ("number" in your first example)
  • intervals: A 2-row matrix with the same number of columns as the length of frequencies, with the first row being the lower class boundary, and the second row being the upper class boundary. Alternatively, "intervals" may be a column in your data.frame, and you may specify sep (and possibly, trim) to have the function automatically create the required matrix for you.
  • sep: The separator in your "intervals" column in your data.frame. It is passed to strsplit, so it is treated as a regular expression.
  • trim: A regular expression of characters that need to be removed before trying to coerce to a numeric matrix. One pattern is built into the function: trim = "cut". This sets the regular expression pattern to remove (, ), [, and ] from the input.

Here's the function (with comments showing how I used your instructions to put it together):

GroupedMedian <- function(frequencies, intervals, sep = NULL, trim = NULL) {
  # If "sep" is specified, the function will try to create the 
  #   required "intervals" matrix. "trim" removes any unwanted 
  #   characters before attempting to convert the ranges to numeric.
  if (!is.null(sep)) {
    if (is.null(trim)) pattern <- ""
    else if (trim == "cut") pattern <- "\\[|\\]|\\(|\\)"
    else pattern <- trim
    intervals <- sapply(strsplit(gsub(pattern, "", intervals), sep), as.numeric)
  }

  cf <- cumsum(frequencies)
  Midrow <- findInterval(max(cf)/2, cf) + 1
  L <- intervals[1, Midrow]      # lower class boundary of median class
  h <- diff(intervals[, Midrow]) # size of median class
  f <- frequencies[Midrow]       # frequency of median class
  # cumulative frequency of the class before the median class
  #   (0 when the median class is the first class)
  cf2 <- if (Midrow > 1) cf[Midrow - 1] else 0
  n_2 <- max(cf)/2               # total observations divided by 2

  unname(L + (n_2 - cf2)/f * h)
}

Here's a sample data.frame to work with:

mydf <- structure(list(salary = c("1500-1600", "1600-1700", "1700-1800", 
    "1800-1900", "1900-2000", "2000-2100", "2100-2200", "2200-2300", 
    "2300-2400", "2400-2500"), number = c(110L, 180L, 320L, 460L, 
    850L, 250L, 130L, 70L, 20L, 10L)), .Names = c("salary", "number"), 
    class = "data.frame", row.names = c(NA, -10L))
mydf
#       salary number
# 1  1500-1600    110
# 2  1600-1700    180
# 3  1700-1800    320
# 4  1800-1900    460
# 5  1900-2000    850
# 6  2000-2100    250
# 7  2100-2200    130
# 8  2200-2300     70
# 9  2300-2400     20
# 10 2400-2500     10

Now, we can simply do:

GroupedMedian(mydf$number, mydf$salary, sep = "-")
# [1] 1915.294

Here's an example of the function in action on some made up data:

set.seed(1)
x <- sample(100, 100, replace = TRUE)
y <- data.frame(table(cut(x, 10)))
y
#           Var1 Freq
# 1   (1.9,11.7]    8
# 2  (11.7,21.5]    8
# 3  (21.5,31.4]    8
# 4  (31.4,41.2]   15
# 5    (41.2,51]   13
# 6    (51,60.8]    5
# 7  (60.8,70.6]   11
# 8  (70.6,80.5]   15
# 9  (80.5,90.3]   11
# 10  (90.3,100]    6

### Here's GroupedMedian's output on the grouped data.frame...
GroupedMedian(y$Freq, y$Var1, sep = ",", trim = "cut")
# [1] 49.49231

### ... and the output of median on the original vector
median(x)
# [1] 49.5

By the way, the sample data that you provided seems to contain a mistake in one of the ranges: all were separated by dashes except one, which was separated by a comma. Since strsplit splits on a regular expression by default, you can still use the function like this:

x<-c(110,180,320,460,850,250,130,70,20,10)
colnames<-c("numbers")
rownames<-c("[1500-1600]","(1600-1700]","(1700-1800]","(1800-1900]",
            "(1900-2000]"," (2000,2100]","(2100-2200]","(2200-2300]",
            "(2300-2400]","(2400-2500]")
y<-matrix(x,nrow=length(x),dimnames=list(rownames,colnames))
GroupedMedian(y[, "numbers"], rownames(y), sep="-|,", trim="cut")
# [1] 1915.294
Answer from A5C1D2H2I1M1N2O1R2T1 on Stack Overflow
🌐
BYJUS
byjus.com › maths › median-of-grouped-data
Median of Grouped Data
... The formula to find the median of grouped data is: Median = l+ [((n/2) – cf)/f] × h Where l = lower limit of median class, n = number of observations, h = class size, f = frequency of median class, cf = cumulative frequency of class preceding ...
Published   June 16, 2022
Views   34K
People also ask

What is the formula to find the median of grouped data?
The formula to find the median of grouped data is: Median = l + [((n/2) – cf)/f] × h, where l = lower limit of median class, n = number of observations, h = class size, f = frequency of median class, cf = cumulative frequency of class preceding the median class.
What is meant by the median in statistics?
In statistics, the median is the middle value of the given dataset.
What is the median class?
The median class is the class interval whose cumulative frequency is greater than (and nearest to) n/2.
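As a minimal sketch of the formula above in Python: the function name `grouped_median` and its argument layout are illustrative choices, not taken from any of the sources on this page, and the median class is assumed to be the first class whose cumulative frequency reaches n/2.

```python
from itertools import accumulate


def grouped_median(lowers, uppers, freqs):
    """Estimate the median of grouped data by linear interpolation."""
    cum = list(accumulate(freqs))      # cumulative frequencies
    half = cum[-1] / 2                 # n / 2
    # Median class: first class whose cumulative frequency reaches n/2
    # (an assumption about how ties at a class boundary are handled).
    k = next(i for i, c in enumerate(cum) if c >= half)
    l = lowers[k]                      # lower limit of the median class
    h = uppers[k] - lowers[k]          # class size
    f = freqs[k]                       # frequency of the median class
    cf = cum[k - 1] if k > 0 else 0    # cumulative frequency before it
    return l + (half - cf) / f * h


# Salary table used in the answers on this page.
lowers = list(range(1500, 2500, 100))
uppers = list(range(1600, 2600, 100))
freqs = [110, 180, 320, 460, 850, 250, 130, 70, 20, 10]
print(round(grouped_median(lowers, uppers, freqs), 3))  # 1915.294
```

Here n/2 = 1200, the median class is 1900-2000 (cf = 1070, f = 850, h = 100), so the estimate is 1900 + (130/850) × 100.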
Top answer
1 of 6
7

2 of 6
4

I've written it like this to clearly explain how it's being worked out. A more compact version is appended.

library(data.table)

#constructing the dataset with the salary range split into low and high
salarydata <- data.table(
  salaries_low = 100*c(15:24),
  salaries_high = 100*c(16:25),
  numbers = c(110,180,320,460,850,250,130,70,20,10)
)

#calculating cumulative number of observations
salarydata <- salarydata[,cumnumbers := cumsum(numbers)]
salarydata
#     salaries_low salaries_high numbers cumnumbers
#  1:         1500          1600     110        110
#  2:         1600          1700     180        290
#  3:         1700          1800     320        610
#  4:         1800          1900     460       1070
#  5:         1900          2000     850       1920
#  6:         2000          2100     250       2170
#  7:         2100          2200     130       2300
#  8:         2200          2300      70       2370
#  9:         2300          2400      20       2390
# 10:         2400          2500      10       2400

#identifying median group
mediangroup <- salarydata[
  (cumnumbers - numbers) <= (max(cumnumbers)/2) & 
  cumnumbers >= (max(cumnumbers)/2)]
mediangroup
#    salaries_low salaries_high numbers cumnumbers
# 1:         1900          2000     850       1920

#creating the variables needed to calculate median
mediangroup[,l := salaries_low]
mediangroup[,h := salaries_high - salaries_low]
mediangroup[,f := numbers]
mediangroup[,c := cumnumbers- numbers]
n = salarydata[,sum(numbers)]

#calculating median
median <- mediangroup[,l + ((h/f)*((n/2)-c))]
median
   # [1] 1915.294

The compact version -

EDIT: Changed to a function at @AnandaMahto's suggestion. Also, using more general variable names.

library(data.table)

#Creating function

CalculateMedian <- function(
   LowerBound,
   UpperBound,
   Obs
)
{
   #calculating cumulative number of observations and n
   dataset <- data.table(UpperBound, LowerBound, Obs)

   dataset <- dataset[,cumObs := cumsum(Obs)]
   n = dataset[,max(cumObs)]

   #identifying mediangroup and dynamically calculating l,h,f,c. We already have n.
   median <- dataset[
      (cumObs - Obs) <= (max(cumObs)/2) & 
      cumObs >= (max(cumObs)/2),

      LowerBound + ((UpperBound - LowerBound)/Obs) * ((n/2) - (cumObs- Obs))
   ]

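   # If n/2 falls exactly on a class boundary, two adjacent classes match
   # the filter (and give the same value when the classes are contiguous),
   # so return only the first result.
   return(median[1])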
}


# Using function
CalculateMedian(
  LowerBound = 100*c(15:24),
  UpperBound = 100*c(16:25),
  Obs = c(110,180,320,460,850,250,130,70,20,10)
)
# [1] 1915.294
🌐
Reddit
reddit.com › r/excel › is there a single formula for calculating median of grouped data for multiple datasets ?
r/excel on Reddit: Is there a single formula for Calculating Median of Grouped Data for multiple datasets ?
August 11, 2022 -

Each row is a separate dataset (up to 150 rows in a spreadsheet). The columns give the frequency in each group. I can manually find the median class and calculate the median for each row (albeit with some difficulty). But would like to make it a more automatic procedure.

I hope the screen shot below helps.

Top answer
2 of 6
1
With grouped data, you have each x-value and its frequency. For example, if the data are 0 0 1 2 3 3 4 4 4, you could write it as:

x  0  1  2  3  4
f  2  1  1  2  3

The median is found by finding the half-way point, i.e. the 5th point, which is x = 3. If the frequencies are fairly large, e.g. in the 100s, then you cumulatively add until you get to half-way.

If, instead of simple x-values, you have, say, age ranges or income brackets, then you can only find the median class (bracket) this way. There is a further adjustment to get an actual numerical estimate of the median.

My median brackets are different for each row, so the Excel formula for the adjustment is different for each row, and I have to physically change the formula for each row. That's what I would like to do automatically. For example:

in row 17, the formulae use columns W and X to calculate the median
in row 18, the formulae use columns U and V to calculate the median

Not sure if this is what you wanted to know about the need to automate.
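The "cumulatively add until you get to half-way" step for simple x-values can be sketched in Python as follows. The function name `median_from_frequency_table` is hypothetical, and the sketch assumes the x-values are already sorted in ascending order.

```python
def median_from_frequency_table(values, freqs):
    """Median of discrete data given as sorted values and their frequencies."""
    n = sum(freqs)
    # 1-based positions of the middle observation(s); they coincide when n is odd.
    lo_pos, hi_pos = (n + 1) // 2, (n + 2) // 2

    def value_at(pos):
        # Add frequencies cumulatively until the running count reaches pos.
        running = 0
        for x, f in zip(values, freqs):
            running += f
            if running >= pos:
                return x

    return (value_at(lo_pos) + value_at(hi_pos)) / 2


# The example data 0 0 1 2 3 3 4 4 4 written as a frequency table.
print(median_from_frequency_table([0, 1, 2, 3, 4], [2, 1, 1, 2, 3]))  # 3.0
```

With nine observations the middle position is the 5th, and the running count first reaches 5 at x = 3.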
🌐
Math is Fun
mathsisfun.com › data › frequency-grouped-mean-median-mode.html
Mean, Median and Mode from Grouped Frequencies
59, 65, 61, 62, 53, 55, 60, 70, ... Mean = (... + 59 + 68 + 61 + 67) / 21 = 61.38095... To find the Median Alex places the numbers in value order and finds the middle number....
🌐
Reddit
reddit.com › r/askmath › when calculating the median of grouped data, why do we strictly choose the nearest class with cumulative frequency higher than n/2?
r/askmath on Reddit: When calculating the median of grouped data, why do we strictly choose the nearest class with cumulative frequency higher than N/2?
October 24, 2021 -

Say we're calculating the median of grouped data and the value of N/2 is found to be 170. If the class 30-40 has a cumulative frequency of 169.5, and the class 40-50 has a cumulative frequency of 180, we choose the class 40-50 as the median class.

Why do we do that, even though class 30-40 is clearly closer to it? Why can't it be the class with the closest cumulative frequency to it?
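One way to see why, as a rough sketch under the usual assumption that observations are spread evenly within each class: by the upper boundary of 30-40 the running count is only 169.5, which has not yet reached N/2 = 170, so the N/2-th observation has to sit inside 40-50. The variable names below are illustrative; the numbers are the ones from the question.

```python
# Numbers taken from the question above.
half_n = 170.0       # N/2
cum_30_40 = 169.5    # cumulative frequency at the end of class 30-40
cum_40_50 = 180.0    # cumulative frequency at the end of class 40-50

# At x = 40 only 169.5 observations have been counted, still short of N/2,
# so the median must lie above 40, however close 169.5 is to 170.
assert cum_30_40 < half_n <= cum_40_50

# Interpolate inside 40-50, the class in which the running count passes N/2.
lower, width = 40.0, 10.0
freq = cum_40_50 - cum_30_40
median = lower + (half_n - cum_30_40) / freq * width
print(round(median, 3))  # 40.476
```

Because 169.5 is so close to 170, the estimate lands just above the boundary at 40, which is where the "closeness" of the lower class shows up.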

Top answer
1 of 1
1
Hello NAFISA,

I see that you are trying to calculate the median of grouped data using MATLAB's median function. However, the 'median' function in MATLAB is designed for raw data inputs and does not directly compute the median for grouped data. You can expand your grouped data as follows:

class_intervals = [0 5; 5 10; 10 15; 15 20; 20 25; 25 30; 30 35; 35 40; 40 45; 45 50];
frequencies = [14, 8, 20, 7, 11, 10, 5, 16, 21, 9];
midpoints = (class_intervals(:, 1) + class_intervals(:, 2)) / 2;
expanded_data = [];
for i = 1:length(frequencies)
    expanded_data = [expanded_data, repmat(midpoints(i), 1, frequencies(i))];
end
median_value = median(expanded_data);
disp(['The median is: ', num2str(median_value)]);

Using this method, you will find that the output is 27.5 instead of the expected 25.25. This discrepancy occurs because, when you expand grouped data into individual data points, you assume that all data points within a class interval are located at the midpoint of that interval. For example, if a class interval is [20, 25] with a frequency of 11, you assume there are 11 data points all exactly at 22.5. This assumption can lead to inaccuracies because the actual data points could be spread across the entire interval [20, 25].

So, as Muskan mentioned, you can use the grouped data median formula (L + ((N/2 - cf) / f) * h). If you frequently work with frequency distribution tables and find it cumbersome to use this formula manually, you can use MATLAB functions to simplify the process.

Here is an example:

class_intervals1 = [0 5; 5 10; 10 15; 15 20; 20 25; 25 30; 30 35; 35 40; 40 45; 45 50];
frequencies1 = [14, 8, 20, 7, 11, 10, 5, 16, 21, 9];
class_intervals2 = [420 430; 430 440; 440 450; 450 460; 460 470; 470 480; 480 490; 490 500];
frequencies2 = [336, 2112, 2336, 1074, 1553, 1336, 736, 85];

% Calculate the median using the custom function
median_value1 = groupedMedian(class_intervals1, frequencies1);
median_value2 = groupedMedian(class_intervals2, frequencies2);
disp(['The median of the grouped data1 is: ', num2str(median_value1)]);
disp(['The median of the grouped data2 is: ', num2str(median_value2)]);

function median_value = groupedMedian(class_intervals, frequencies)
    % Calculate cumulative frequency
    cum_frequencies = cumsum(frequencies);
    % Total number of observations
    N = sum(frequencies);
    % Find the median class (first class where cumulative frequency >= N/2)
    median_class_index = find(cum_frequencies >= N/2, 1);
    % Extract the median class boundaries and frequency
    L = class_intervals(median_class_index, 1);
    f = frequencies(median_class_index);
    % Cumulative frequency before the median class (0 if it is the first
    % class; indexing with 0 would raise an error in MATLAB)
    if median_class_index > 1
        CF = cum_frequencies(median_class_index - 1);
    else
        CF = 0;
    end
    h = class_intervals(median_class_index, 2) - L;
    % Calculate the median
    median_value = L + ((N/2 - CF) / f) * h;
end

You can also try referring to these File Exchange functions, which might help you:
https://www.mathworks.com/matlabcentral/fileexchange/38238-gmedian
https://www.mathworks.com/matlabcentral/fileexchange/38228-gprctile

I hope this helps you moving forward.
🌐
SMath
smath.com › en-US › forum › topic › TY7Zc6 › how-to-calculate-median-of-grouped-data-if-group-size-is-variable
how to calculate median of grouped data if group size is variable - SMath
I learned in school that Median = L + (n/2-cf)*h/f where L = lower limit of median class n = no. of observations cf = cumulative frequency of class preceding the median class, f = frequency of median class, h = class size (assuming class size to be equal). I used to use this formula for grouped ...
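For variable group sizes, a small Python sketch suggests the same interpolation still applies as long as h is taken to be the width of the median class itself. The function name `grouped_median_from_edges`, the edge-list input and the example data are all hypothetical, chosen only to illustrate unequal class widths.

```python
from bisect import bisect_left
from itertools import accumulate


def grouped_median_from_edges(edges, freqs):
    """Grouped median for classes [edges[i], edges[i+1]) of any width."""
    cum = list(accumulate(freqs))
    half = cum[-1] / 2
    k = bisect_left(cum, half)          # first class with cumulative freq >= n/2
    cf = cum[k - 1] if k > 0 else 0     # cumulative frequency before that class
    h = edges[k + 1] - edges[k]         # width of the median class only
    return edges[k] + (half - cf) / freqs[k] * h


# Made-up data with class widths 10, 10, 20 and 40.
edges = [0, 10, 20, 40, 80]
freqs = [5, 10, 20, 5]
print(grouped_median_from_edges(edges, freqs))  # 25.0
```

Here n/2 = 20 falls in the class 20-40 (cf = 15, f = 20, h = 20), giving 20 + (5/20) × 20 = 25; the widths of the other classes never enter the calculation.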
Top answer
1 of 3
2

Because this is essentially a duplicate, I address a few issues that do not explicitly overlap the related question or answer:

If a class has cumulative frequency .5, then the median is at the boundary of that class and the next larger one.

If n is large (really the only case where this method is generally successful), there is little difference between n/2 and (n+1)/2 in the formula. All references I checked use n/2.

Before computers were widely available, large datasets were customarily reduced to categories (classes) and plotted as histograms. Then the histograms were used to approximate the mean, variance, median, and other descriptive measures. Nowadays, it is best just to use a statistical computer package to find exact values of all measures.

One remaining application is to try to re-claim the descriptive measures from grouped data or from a histogram published in a journal. These are cases in which the original data are no longer available.

This procedure to approximate the sample median from grouped data assumes that the data are distributed roughly uniformly throughout the median interval. It then uses interpolation to approximate the median. (By contrast, methods to approximate the sample mean and sample variance from grouped data assume that all observations are concentrated at their class midpoints.)
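The midpoint assumption for the mean and variance can be sketched in Python as below. The function name `grouped_mean_variance` is illustrative, and dividing by n - 1 (the sample variance) is a choice made for this sketch rather than something stated in the answer.

```python
def grouped_mean_variance(lowers, uppers, freqs):
    """Approximate mean and sample variance, placing every observation
    at the midpoint of its class."""
    mids = [(lo + hi) / 2 for lo, hi in zip(lowers, uppers)]
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, mids)) / n
    # Sample variance with an n - 1 denominator (an assumption of this sketch).
    var = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids)) / (n - 1)
    return mean, var


# Salary table used elsewhere on this page.
lowers = list(range(1500, 2500, 100))
uppers = list(range(1600, 2600, 100))
freqs = [110, 180, 320, 460, 850, 250, 130, 70, 20, 10]
mean, var = grouped_mean_variance(lowers, uppers, freqs)
print(mean)  # 1898.75
```

Unlike the median estimate, nothing is interpolated here: each class contributes its frequency times its midpoint.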

2 of 3
0

According to what I learned, the class where the median is located is the lowest class for which the cumulative frequency equals or exceeds N/2.

Therefore, the median class would be 30-40, which gives approximately 30.833, or about 31 as you said.

🌐
BBC
bbc.co.uk › bitesize › guides › zwhgk2p › revision › 7
Averages from a grouped table - Analysing data - Edexcel - GCSE Maths Revision - Edexcel - BBC Bitesize
February 13, 2023 - If data is organised into groups, we do not know the exact value of each item of data, just which group it belongs to. This means that we cannot find the exact value for the mode (an average found by selecting the most commonly occurring value; there can be more than one mode and there can also be no mode), median (the middle value) ...
🌐
GeeksforGeeks
geeksforgeeks.org › mathematics › median-of-grouped-data
Median of Grouped Data: Formula, How to Find, and Solved Examples - GeeksforGeeks
July 23, 2025 - To find the median of ungrouped data, one can simply sort the data points in ascending order. In the case of an odd number of observations, the middle value is the median. On the other hand, for an even number of observations, one can take the mean of the two middle values to find the median. But there is a different method to find the median of grouped data, discussed later in this article.
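The ungrouped rule described in this snippet can be sketched in a few lines of Python. The function name `ungrouped_median` is illustrative; the standard library's `statistics.median` follows the same rule.

```python
def ungrouped_median(data):
    """Median of raw (ungrouped) observations."""
    ordered = sorted(data)              # ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]             # odd count: the middle value
    # even count: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(ungrouped_median([5, 1, 3]))      # 3
print(ungrouped_median([4, 1, 3, 2]))   # 2.5
```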
🌐
Microsoft Community
community.fabric.microsoft.com › t5 › DAX-Commands-and-Tips › How-to-calculate-median-for-grouped-data-without-making-new › m-p › 3823113
Solved: How to calculate median for grouped data without m... - Microsoft Fabric Community
April 10, 2024 - I got this original table: I need to calculate median values for selling_eur. How can I do it? I have only some ideas. Transform the table with duplicated rows for calculation. It will not work fast. Somehow add a new column, which contains arrays with duplicated prices for calculation. I got no id...
🌐
The Math Doctors
themathdoctors.org › finding-the-median-of-grouped-data
Finding the Median of Grouped Data – The Math Doctors
Derivation of Linear Interpolation Median Formula Median, m = L + [ (N/2 – F) / f ]C. How does this median formula come? My teacher did not show and proof how does this formula come. Therefore, I just substitute and blindly use the formula. Can you help me?
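A short derivation sketch, assuming the symbols mean what they usually do in this formula (the snippet itself does not define them): L is the lower boundary of the median class, C its width, f its frequency, F the cumulative frequency of all classes before it, and N the total number of observations. If the f observations are assumed to be spread evenly across the median class, then the number of observations at or below a point m inside that class is F + f(m - L)/C. Setting that count equal to N/2 and solving for m gives the formula:

```latex
\begin{align*}
F + f\,\frac{m - L}{C} &= \frac{N}{2} \\
\frac{m - L}{C} &= \frac{N/2 - F}{f} \\
m &= L + \left[\frac{N/2 - F}{f}\right] C
\end{align*}
```

The even-spread assumption is what makes this a linear interpolation; with a different assumed shape inside the class, the estimate would change.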
🌐
Microsoft Support
support.microsoft.com › en-us › office › calculate-the-median-of-a-group-of-numbers-2e3ec1aa-5046-4b4b-bfc4-4266ecf39bf9
Calculate the median of a group of numbers - Microsoft Support
In the Formula Builder pane, type MEDIAN in the Search box, and then select Insert Function. Make sure the cell span in the Number1 box matches your data (In this case, A1:A7). For this example, the answer that appears in the cell should be 8. Tip: To switch between viewing the results and viewing the formulas that return the results, press CTRL+` (grave accent), or on the Formulas tab, in the Formula Auditing group, select the Show Formulas button.
🌐
Quora
quora.com › How-can-I-find-a-median-in-grouped-data-if-it-is-odd
How to find a median in grouped data if it is odd - Quora
Answer (1 of 2): Find the cf (cumulative frequency) first, then the (N+1)÷2th term. Then take the cumulative frequency that is exactly equal to or greater than the (N+1)÷2th term. Finally, the Xi value in that row will be the median.
🌐
University of Massachusetts
people.umass.edu › biep540w › pdf › Grouped Data Calculation.pdf pdf
Lecture 2 – Grouped Data Calculation
– Grouped Data
Step 1: Construct the cumulative frequency distribution.
Step 2: Decide the class that contains the median. The median class is the first class whose cumulative frequency is at least n/2.
Step 3: Find the median by using the following formula: Median ...
🌐
Slideshare
slideshare.net › home › education › median of grouped data
Median of grouped data | PPTX
This document provides steps for ... frequencies, and cumulative frequencies. 2. Find the median class by calculating N/2, where N is the total number of data points....
🌐
AtoZMath
atozmath.com › example › StatsG.aspx
Median Example for grouped data
Median Example for grouped data - Median Example for grouped data, step by step online
🌐
CalculatorSoup
calculatorsoup.com › calculators › statistics › mean-median-mode.php
Mean, Median, Mode Calculator
November 4, 2025 - For the data set 1, 1, 2, 6, 6, ... to highest value, the median \( \widetilde{x} \) is the data point separating the upper half of the data values from the lower half....
🌐
Enterprise DNA
forum.enterprisedna.co › dax › dax calculations
Calculate median on a grouped column - DAX Calculations - Enterprise DNA Forum
February 8, 2021 - Hi. I have a calculation that I need to solve and I’m not quite sure how to proceed. Here is a sample of my data. Data Sample.xlsx (8.8 KB) I need to group the ‘Identifier’ where ‘06 Mos Post-ALC ERs’ = 1 I used the following DAX to generate the start of code necessary to calculate the median of a grouped column.