To find the percentile of a value relative to an array (or in your case a dataframe column), use the scipy function stats.percentileofscore().

For example, if we have a value x (the other numerical value not in the dataframe), and a reference array, arr (the column from the dataframe), we can find the percentile of x by:

from scipy import stats
percentile = stats.percentileofscore(arr, x)

Note that the third parameter of stats.percentileofscore(), kind, has a significant impact on the resulting percentile. You can choose from 'rank' (the default), 'weak', 'strict', and 'mean'. See the docs for more information.

For an example of the difference:

>>> df
   a
0  1
1  2
2  3
3  4
4  5

>>> stats.percentileofscore(df['a'], 4, kind='rank')
80.0

>>> stats.percentileofscore(df['a'], 4, kind='weak')
80.0

>>> stats.percentileofscore(df['a'], 4, kind='strict')
60.0

>>> stats.percentileofscore(df['a'], 4, kind='mean')
70.0
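
To make the difference between the kinds concrete, here is a minimal pure-Python sketch of what 'strict', 'weak', and 'mean' compute. The helper name percentile_of_score is hypothetical and is not part of scipy; it only mirrors the counting logic and leaves out 'rank'.

```python
def percentile_of_score(values, score, kind="mean"):
    """Hypothetical helper mirroring the 'strict', 'weak' and 'mean' kinds."""
    n = len(values)
    below = sum(v < score for v in values)          # values strictly less than score
    at_or_below = sum(v <= score for v in values)   # values less than or equal to score
    strict = 100.0 * below / n
    weak = 100.0 * at_or_below / n
    if kind == "strict":
        return strict
    if kind == "weak":
        return weak
    if kind == "mean":
        return (strict + weak) / 2                  # average of the two counts
    raise ValueError(f"unsupported kind: {kind!r}")


column = [1, 2, 3, 4, 5]
results = {kind: percentile_of_score(column, 4, kind) for kind in ("strict", "weak", "mean")}
print(results)
```

With the five-value column from the example, this reproduces the 60.0, 80.0, and 70.0 shown above.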

As a final note, a value that is greater than 80% of the other values in the column is in the 80th percentile, not the 20th (the example above shows how the kind parameter shifts the exact score). See this Wikipedia article for more information.

Answer from wingr on Stack Overflow
2 of 5
5

Probably very late but still

df['column_name'].describe()

gives you the default 25th, 50th, and 75th percentiles along with other summary statistics (count, mean, std, min, max). If you want specific percentiles, pass them explicitly:

df['column_name'].describe(percentiles=[0.1, 0.2, 0.3, 0.5])

This will give you 10th, 20th, 30th and 50th percentiles. You can give as many values as you want.

The resulting object can be accessed like a dict:

desc = df['column_name'].describe(percentiles=[0.1, 0.2, 0.3, 0.5])
print(desc)
print(desc['10%'])
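
As a quick sanity check, the value that describe() reports under a label such as '10%' should match what Series.quantile() returns for the same fraction. The sketch below assumes a small made-up Series; the variable names are illustrative only.

```python
import pandas as pd

# Made-up data: the integers 1 through 11
s = pd.Series(range(1, 12))

# Ask describe() for the 10th and 20th percentiles
desc = s.describe(percentiles=[0.1, 0.2])

# Look up the 10th percentile by its label, then compute it directly
p10_from_describe = desc["10%"]
p10_from_quantile = s.quantile(0.1)

print(p10_from_describe, p10_from_quantile)
```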
Top answer
1 of 6
183
  • You can use the pandas.DataFrame.quantile() function.
    • quantile() takes an interpolation argument that controls what is returned when the requested quantile falls between two positions in your data. The options are:
      • 'linear', 'lower', 'higher', 'midpoint', or 'nearest'.
      • By default, it performs linear interpolation.
      • These interpolation methods are discussed in the Wikipedia article for percentile
import pandas as pd
import numpy as np

# sample data 
np.random.seed(2023)  # for reproducibility
data = {'Category': np.random.choice(['hot', 'cold'], size=(10,)),
        'field_A': np.random.randint(0, 100, size=(10,)),
        'field_B': np.random.randint(0, 100, size=(10,))}
df = pd.DataFrame(data)

df.field_A.mean()  # Same as df['field_A'].mean()
# 51.1

df.field_A.median() 
# 50.0

# You can call `quantile(q)` to get the q-th quantile,
# where `q` is a fraction between 0 and 1.

df.field_A.quantile(0.1)  # 10th percentile
# 15.6

df.field_A.quantile(0.5)  # same as median
# 50.0

df.field_A.quantile(0.9)  # 90th percentile
# 88.8

df.groupby('Category').field_A.quantile(0.1)
#Category
#cold    28.8
#hot      8.6
#Name: field_A, dtype: float64

df

  Category  field_A  field_B
0     cold       96       58
1     cold       22       28
2      hot       17       81
3     cold       53       71
4     cold       47       63
5      hot       77       48
6     cold       39       32
7      hot       69       29
8      hot       88       49
9      hot        3       49
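
The interpolation options listed in this answer are easier to compare on a tiny Series where the requested quantile falls between two data points. This is a minimal sketch with made-up data, not taken from the answer itself.

```python
import pandas as pd

# Four evenly spaced values; q=0.4 lands at position 0.4 * 3 = 1.2,
# i.e. between 20 (index 1) and 30 (index 2)
s = pd.Series([10, 20, 30, 40])

methods = ["linear", "lower", "higher", "midpoint", "nearest"]
results = {m: float(s.quantile(0.4, interpolation=m)) for m in methods}

for name, value in results.items():
    print(f"{name:>8}: {value}")
```

Here 'linear' interpolates between the two neighbours, 'lower' and 'higher' pick one of them, 'midpoint' averages them, and 'nearest' picks whichever is closer to the fractional position.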
2 of 6
39

assume series s

s = pd.Series(np.arange(100))

Get quantiles for [.1, .2, .3, .4, .5, .6, .7, .8, .9]

s.quantile(np.linspace(.1, 1, 9, endpoint=False))

0.1     9.9
0.2    19.8
0.3    29.7
0.4    39.6
0.5    49.5
0.6    59.4
0.7    69.3
0.8    79.2
0.9    89.1
dtype: float64

OR

s.quantile(np.linspace(.1, 1, 9, endpoint=False), interpolation='lower')

0.1     9
0.2    19
0.3    29
0.4    39
0.5    49
0.6    59
0.7    69
0.8    79
0.9    89
dtype: int32
Top answer
1 of 3
48

TL;DR

Use

sz = temp['INCOME'].size-1
temp['PCNT_LIN'] = temp['INCOME'].rank(method='max').apply(lambda x: 100.0*(x-1)/sz)

   INCOME    PCNT_LIN
0      78   44.444444
1      38   11.111111
2      42   22.222222
3      48   33.333333
4      31    0.000000
5      89   55.555556
6      94   66.666667
7     102   77.777778
8     122  100.000000
9     122  100.000000

Answer

It is actually very simple once you understand the mechanics. When you are looking for the percentile of a score, you already have the scores in each row. The only step left is to compute the percentage of values that are less than or equal to the selected value. This is exactly what kind='weak' in scipy.stats.percentileofscore() and method='max' (with pct=True) in Series.rank() do. To invert it, run Series.quantile() with interpolation='lower'.

So, the behavior of the scipy.stats.percentileofscore(), Series.rank() and Series.quantile() is consistent, see below:

In[]:
import pandas as pd
import scipy.stats

temp = pd.DataFrame([78, 38, 42, 48, 31, 89, 94, 102, 122, 122], columns=['INCOME'])
temp['PCNT_RANK']=temp['INCOME'].rank(method='max', pct=True)
temp['POF']  = temp['INCOME'].apply(lambda x: scipy.stats.percentileofscore(temp['INCOME'], x, kind='weak'))
temp['QUANTILE_VALUE'] = temp['PCNT_RANK'].apply(lambda x: temp['INCOME'].quantile(x, 'lower'))
temp['RANK']=temp['INCOME'].rank(method='max')
sz = temp['RANK'].size - 1 
temp['PCNT_LIN'] = temp['RANK'].apply(lambda x: (x-1)/sz)
temp['CHK'] = temp['PCNT_LIN'].apply(lambda x: temp['INCOME'].quantile(x))

temp

Out[]:
   INCOME  PCNT_RANK    POF  QUANTILE_VALUE  RANK  PCNT_LIN    CHK
0      78        0.5   50.0              78   5.0  0.444444   78.0
1      38        0.2   20.0              38   2.0  0.111111   38.0
2      42        0.3   30.0              42   3.0  0.222222   42.0
3      48        0.4   40.0              48   4.0  0.333333   48.0
4      31        0.1   10.0              31   1.0  0.000000   31.0
5      89        0.6   60.0              89   6.0  0.555556   89.0
6      94        0.7   70.0              94   7.0  0.666667   94.0
7     102        0.8   80.0             102   8.0  0.777778  102.0
8     122        1.0  100.0             122  10.0  1.000000  122.0
9     122        1.0  100.0             122  10.0  1.000000  122.0

Now the column PCNT_RANK holds the ratio of values that are smaller than or equal to the one in the column INCOME. If you want the "interpolated" ratio, it is in the column PCNT_LIN. And because the calculation uses Series.rank(), it is pretty fast and will crunch through 255M numbers in seconds.
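
The claim about PCNT_RANK can be checked directly: for every row, rank(method='max', pct=True) should equal the share of values that are less than or equal to that row's value. The sketch below uses the same INCOME numbers; the variable names are illustrative.

```python
import pandas as pd

income = pd.Series([78, 38, 42, 48, 31, 89, 94, 102, 122, 122])

# Percentile rank as computed in the answer
pct_rank = income.rank(method="max", pct=True)

# Manual count: fraction of values less than or equal to each value
share_at_or_below = income.apply(lambda x: (income <= x).mean())

print(pd.DataFrame({"INCOME": income,
                    "PCNT_RANK": pct_rank,
                    "MANUAL": share_at_or_below}))
```

The manual version compares every value against the whole column, so it is only meant as a check on small data; rank() is the one to use at scale.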


Here I will explain how you get the value from using quantile() with linear interpolation:

temp['INCOME'].quantile(0.11)
37.93

Our data temp['INCOME'] has only ten values. According to the formula from your link to Wiki the rank of 11th percentile is

rank = 11*(10-1)/100 + 1 = 1.99

The integer part of the rank is 1, which corresponds to the value 31, and the value with rank 2 (i.e. the next bin) is 38. The fractional part of the rank, 0.99, is the interpolation weight. This leads to the result:

 31 + (38-31)*(0.99) = 37.93

For the values themselves, the fractional part has to be zero, so it is easy to do the inverse calculation to get the percentile:

p = (rank - 1)*100/(10 - 1)
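
The hand calculation above can be written out as a short pure-Python sketch. The function name linear_quantile is hypothetical; it uses a zero-based position, which is the rank from the formula minus 1.

```python
import math


def linear_quantile(values, q):
    """Hypothetical helper: quantile with linear interpolation between neighbours."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)   # zero-based fractional position (rank - 1)
    lower = math.floor(position)        # index of the value just below the position
    fraction = position - lower         # interpolation weight toward the next value
    if lower + 1 >= len(ordered):       # q == 1 lands exactly on the last value
        return float(ordered[-1])
    return ordered[lower] + (ordered[lower + 1] - ordered[lower]) * fraction


income = [78, 38, 42, 48, 31, 89, 94, 102, 122, 122]
value = linear_quantile(income, 0.11)
print(round(value, 2))
```

For q = 0.11 the position is 0.99, so the result is 31 + (38 - 31) * 0.99, matching the 37.93 computed by hand.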

I hope I made it more clear.

2 of 3
2

This seems to work (here sample is the value, or array of values, whose percentile you want):

A = np.sort(temp['INCOME'].values)
np.interp(sample, A, np.linspace(0, 1, len(A)))

For example:

>>> temp.INCOME.quantile(np.interp([37.5, 38, 122, 121], A, np.linspace(0, 1, len(A))))
0.103175     37.5
0.111111     38.0
1.000000    122.0
0.883333    121.0
Name: INCOME, dtype: float64

Please note that this strategy only makes sense if you want to query a large enough number of values. Otherwise the sorting is too expensive.
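
For a self-contained version of this approach, the sketch below defines sample explicitly and reuses the INCOME values from the earlier answer. The variable names other than A and sample are illustrative.

```python
import numpy as np

income = np.array([78, 38, 42, 48, 31, 89, 94, 102, 122, 122])

# Sort once, then map each sorted value to an evenly spaced fraction in [0, 1]
A = np.sort(income)
grid = np.linspace(0, 1, len(A))

# Values to look up; they do not need to be present in the data
sample = [37.5, 38, 121]
fractions = np.interp(sample, A, grid)

print(fractions)
```

The returned fractions are on a 0 to 1 scale; multiply by 100 to express them as percentiles.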
