numpy weighted quantile

stackoverflow.com › questions › 21844024 › weighted-percentile-using-numpy

Completely vectorized numpy solution

Here is the code I use. It's not optimal (I'm unable to write an optimal version with numpy), but it is still much faster and more reliable than the accepted solution.

import numpy as np

def weighted_quantile(values, quantiles, sample_weight=None,
                      values_sorted=False, old_style=False):
    """ Very close to numpy.percentile, but supports weights.
    NOTE: quantiles should be in [0, 1]!
    :param values: numpy.array with data
    :param quantiles: array-like with many quantiles needed
    :param sample_weight: array-like of the same length as `values`
    :param values_sorted: bool, if True, then will avoid sorting of
        initial array
    :param old_style: if True, will correct output to be consistent
        with numpy.percentile.
    :return: numpy.array with computed quantiles.
    """
    values = np.array(values)
    quantiles = np.array(quantiles)
    if sample_weight is None:
        sample_weight = np.ones(len(values))
    sample_weight = np.array(sample_weight)
    assert np.all(quantiles >= 0) and np.all(quantiles <= 1), \
        'quantiles should be in [0, 1]'

    if not values_sorted:
        sorter = np.argsort(values)
        values = values[sorter]
        sample_weight = sample_weight[sorter]

    weighted_quantiles = np.cumsum(sample_weight) - 0.5 * sample_weight
    if old_style:
        # Rescale to [0, 1] to be consistent with numpy.percentile
        weighted_quantiles -= weighted_quantiles[0]
        weighted_quantiles /= weighted_quantiles[-1]
    else:
        weighted_quantiles /= np.sum(sample_weight)
    return np.interp(quantiles, weighted_quantiles, values)

Examples:

weighted_quantile([1, 2, 9, 3.2, 4], [0.0, 0.5, 1.])

array([ 1. , 3.2, 9. ])

weighted_quantile([1, 2, 9, 3.2, 4], [0.0, 0.5, 1.], sample_weight=[2, 1, 2, 4, 1])

array([ 1. , 3.2, 9. ])

Answer from Alleo on Stack Overflow
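
The `old_style` flag in the answer above is easier to follow with a concrete check. The following is a minimal sketch, not the answer author's code: `midpoint_weighted_quantile` is a hypothetical, condensed re-statement of the same midpoint-and-interpolate idea. It suggests that with unit weights and `old_style=True` the result agrees with NumPy's default linear quantile.

```python
import numpy as np

def midpoint_weighted_quantile(values, quantiles, sample_weight=None, old_style=False):
    """Condensed sketch: interpolate between weighted midpoints of sorted values."""
    values = np.asarray(values, dtype=float)
    quantiles = np.asarray(quantiles, dtype=float)
    if sample_weight is None:
        sample_weight = np.ones(len(values))
    sample_weight = np.asarray(sample_weight, dtype=float)

    order = np.argsort(values)
    values, sample_weight = values[order], sample_weight[order]

    # Each sorted value sits at the midpoint of its own weight interval.
    positions = np.cumsum(sample_weight) - 0.5 * sample_weight
    if old_style:
        # Stretch the positions so the smallest value maps to 0 and the largest to 1.
        positions -= positions[0]
        positions /= positions[-1]
    else:
        positions /= np.sum(sample_weight)
    return np.interp(quantiles, positions, values)

data = [1, 2, 9, 3.2, 4]
q = [0.1, 0.25, 0.5, 0.9]
rescaled = midpoint_weighted_quantile(data, q, old_style=True)
reference = np.quantile(data, q)  # default method is linear interpolation
print(np.allclose(rescaled, reference))
```

Without `old_style`, the outermost positions are 0.1 and 0.9 for this data, so any requested quantile below 0.1 or above 0.9 is clamped to the minimum or maximum by `np.interp`.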

NumPy

numpy.org › doc › stable › reference › generated › numpy.quantile.html

numpy.quantile — NumPy v2.4 Manual

June 22, 2021 - For backward compatibility with previous versions of NumPy, quantile provides four additional discontinuous estimators. Like method='linear', all have m = 1 - q so that j = q*(n-1) // 1, but g is defined as follows. ... Weighted quantiles: More formally, the quantile at probability level \(q\) of a cumulative distribution function \(F(y)=P(Y \leq y)\) with probability measure \(P\) is defined as any number \(x\) that fulfills the coverage conditions

NumPy

numpy.org › devdocs › reference › generated › numpy.quantile.html

numpy.quantile — NumPy v2.5.dev0 Manual

For backward compatibility with previous versions of NumPy, quantile provides four additional discontinuous estimators. Like method='linear', all have m = 1 - q so that j = q*(n-1) // 1, but g is defined as follows. ... Weighted quantiles: More formally, the quantile at probability level \(q\) of a cumulative distribution function \(F(y)=P(Y \leq y)\) with probability measure \(P\) is defined as any number \(x\) that fulfills the coverage conditions

Videos

YouTube - NumPy Quantile Tutorial: Calculate Quantiles in Arrays with ... (03:46, October 29, 2025)

youtube.com - NumPy & Quantiles - YouTube (October 12, 2023)

YouTube - Understanding Quantiles in Python: A Step-by-Step Guide (Numpy ... (11:24, August 21, 2024)

Stack Overflow

stackoverflow.com › questions › 21844024 › weighted-percentile-using-numpy

python - Weighted percentile using numpy - Stack Overflow

This now seems to be implemented in statsmodels:

import numpy as np
from statsmodels.stats.weightstats import DescrStatsW
wq = DescrStatsW(data=np.array([1, 2, 9, 3.2, 4]), weights=np.array([0.0, 0.5, 1.0, 0.3, 0.5]))
wq.quantile(probs=np.array([0.1, 0.9]), return_pandas=False)
# array([2., 9.])

The DescrStatsW object also implements other weighted statistics, such as the weighted mean: https://www.statsmodels.org/stable/generated/statsmodels.stats.weightstats.DescrStatsW.html

GitHub

github.com › nudomarinero › wquantiles

GitHub - nudomarinero/wquantiles: weighted quantiles with Python

Weighted quantiles with Python, including weighted median. This library is based on numpy, which is the only dependence.

Starred by 53 users

Forked by 13 users

Languages Python 100.0%

NumPy

numpy.org › doc › 2.0 › reference › generated › numpy.quantile.html

numpy.quantile — NumPy v2.0 Manual

Weighted quantiles: For weighted quantiles, the above coverage conditions still hold. The empirical cumulative distribution is simply replaced by its weighted version, i.e. \(P(Y \leq t) = \frac{1}{\sum_i w_i} \sum_i w_i 1_{x_i \leq t}\).
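
The weighted empirical distribution in the snippet above can be written out directly. This is a minimal sketch under stated assumptions: `weighted_ecdf` and `smallest_covering_value` are hypothetical helper names, and picking the smallest data value whose weighted ECDF reaches q is one simple way to satisfy the coverage idea, not necessarily how NumPy implements it.

```python
import numpy as np

def weighted_ecdf(x, w, t):
    """P(Y <= t): sum of w_i over all x_i <= t, divided by the total weight."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w[x <= t]) / np.sum(w))

def smallest_covering_value(x, w, q):
    """Smallest data value whose weighted ECDF reaches probability level q."""
    candidates = np.sort(np.asarray(x, dtype=float))
    for t in candidates:
        if weighted_ecdf(x, w, t) >= q:
            return float(t)
    return float(candidates[-1])

x = [1, 2, 9, 3.2, 4]
w = [2, 1, 2, 4, 1]
print(weighted_ecdf(x, w, 3.2))            # 0.7, since (2 + 1 + 4) / 10
print(smallest_covering_value(x, w, 0.5))  # 3.2
```

The loop re-evaluates the ECDF for every candidate, which is quadratic; it is written this way only to mirror the formula term by term.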

Stack Exchange

math.stackexchange.com › questions › 3721765 › calculating-quantiles-of-weighted-array

vectors - Calculating quantiles of weighted array - Mathematics Stack Exchange

Version 0.2

This is still a toy implementation. In particular, it still might be hugely inefficient (I haven't given any thought to that question), and it still hasn't been tested on any large datasets. What is nice about it is that the new class multilist is obviously capable of being considerably elaborated. (No doubt I'll tinker with it a lot, but there isn't likely to be any good reason to post my tinkerings here.)

I'm not sure how to post code in Maths.SE, so the indentation of the code isn't quite consistent.

"""Lists of items with multiplicity, analogous to multisets."""

__all__ = ['individual', 'multilist', 'quantile']

import math, itertools

def individual(q, N):
    """
    Number (1 to N) of individual near q'th quantile of population of size N.
    """
    return math.floor(q*N) + 1 if q < 1 else N

def quantile(x, q):
    """
    Compute the q'th quantile value of the given *sorted* (N.B.!) multilist x.
    """
    return x[individual(q, len(x))]

class multilist(object):
    """
    List of elements with multiplicity: similar to a multiset, whence the name.
    
    The multiplicity of each element is a positive integer. The purpose of the
    multilist is to behave like a list in which each element occurs many times,
    without actually having to store all of those occurrences.
    """

    def __init__(self, x, w):
        """
        Create multilist from list of values and list of their multiplicities.
        """
        self.items = x
        self.times = w
        self.subtotals = list(itertools.accumulate(self.times))

    def __len__(self):
        """
        Get the number of items in a list with multiplicities.

        The syntax needed to call this function is "len(x)", where x is the
        name of the multilist.
        """
        return self.subtotals[-1]

    def __getitem__(self, k):
        """
        Find the k'th item in a list with multiplicities.

        If the multiplicities are m_1, m_2, ..., m_r (note that Python indices
        are 1 less, running from 0 to r - 1), and subtotals M_0, M_1, ..., M_r,
        where M_i = m_1 + m_2 + ... + m_i (i = 0, 1, ..., r), then we want the
        unique i (but the Python code uses i - 1) such that M_{i-1} < k <= M_i.

        The syntax needed to call this function is "x[k]", where x is the name
        of the multilist, and 1 <= k <= len(x).
        """
        for i, M in enumerate(self.subtotals):
            if M >= k:
                return self.items[i]

    def sorted(self):
        """
        Return a sorted copy of the given multilist.

        Note on the implementation: by default, 2-tuples in Python are compared
        lexicographically, i.e. by the first element, or the second in the case
        of a tie; so there is no need for parameter key=operator.itemgetter(0).
        """
        # Inside a method body, `sorted` resolves to the builtin, not this method.
        return multilist(*zip(*sorted(zip(self.items, self.times))))

def main():
    data = multilist([6, 4, 2], [1, 3, 5]).sorted()
    print('median = {}'.format(quantile(data, .5)))

if __name__ == '__main__':
    main()
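
Since the answer flags efficiency as an open question, here is one possible variation rather than the answer author's code: a minimal sketch that replaces the linear scan over subtotals with a binary search from the standard library. The function name `multiplicity_quantile` is hypothetical; the individual-number rule is the same `floor(q*N) + 1` as above.

```python
import bisect
import itertools
import math

def multiplicity_quantile(items, times, q):
    """q'th quantile of values with positive integer multiplicities."""
    pairs = sorted(zip(items, times))            # sort values, carrying multiplicities
    values = [v for v, _ in pairs]
    subtotals = list(itertools.accumulate(m for _, m in pairs))
    n = subtotals[-1]                            # population size with multiplicity
    k = math.floor(q * n) + 1 if q < 1 else n    # 1-based individual number
    # bisect_left returns the first index i with subtotals[i] >= k.
    return values[bisect.bisect_left(subtotals, k)]

print('median = {}'.format(multiplicity_quantile([6, 4, 2], [1, 3, 5], .5)))
```

For the same data as `main()` above, both versions select the fifth of nine individuals, which is the value 2.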

NumPy

numpy.org › doc › 2.1 › reference › generated › numpy.quantile.html

numpy.quantile — NumPy v2.1 Manual

NumPy

numpy.org › doc › 2.2 › reference › generated › numpy.quantile.html

numpy.quantile — NumPy v2.2 Manual

PyPI

pypi.org › project › wquantiles

wquantiles · PyPI

Weighted quantiles, including weighted median, based on numpy

pip install wquantiles

Published May 26, 2021

Version 0.6

Homepage http://github.com/nudomarinero/wquantiles/

NumPy

numpy.org › doc › 2.2 › reference › generated › numpy.nanquantile.html

numpy.nanquantile — NumPy v2.2 Manual

An array of weights associated with the values in a. Each value in a contributes to the quantile according to its associated weight. The weights array can either be 1-D (in which case its length must be the size of a along the given axis) or of the same shape as a.

SciPy

docs.scipy.org › doc › scipy › reference › generated › scipy.stats.quantile.html

quantile — SciPy v1.17.0 Manual

Frequency weights; e.g., for counting number weights, quantile(x, p, weights=weights) is equivalent to quantile(np.repeat(x, weights), p). Values other than finite counting numbers are accepted, but may not have valid statistical interpretations. Not compatible with method='harrell-davis' or those that begin with 'round_'. Returns: quantile: scalar or ndarray. The resulting quantile(s). The dtype is the result dtype of x and p. See also: numpy.quantile
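
The frequency-weight interpretation can be illustrated with NumPy alone. This is a minimal sketch that does not call `scipy.stats.quantile`; it assumes NumPy 1.22 or later (for the `method` keyword) and uses the inverted-CDF definition, for which repeating each value by its count and looking up cumulative counts give the same answer. The probability levels are chosen away from the cumulative-share boundaries to avoid floating-point ties.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.2, 4.0, 9.0])   # already sorted
counts = np.array([2, 1, 4, 1, 2])             # integer frequency weights
probs = np.array([0.25, 0.55, 0.85])

# Route 1: materialise every repeated observation, then take plain quantiles.
expanded = np.repeat(values, counts)
from_expanded = np.quantile(expanded, probs, method="inverted_cdf")

# Route 2: never expand; find the first value whose cumulative share reaches p.
cum_share = np.cumsum(counts) / counts.sum()
from_counts = values[np.searchsorted(cum_share, probs, side="left")]

print(from_expanded, from_counts)
```

Route 2 keeps memory proportional to the number of distinct values, which matters when the counts are large.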

NumPy

numpy.org › doc › 2.3 › reference › generated › numpy.quantile.html

numpy.quantile — NumPy v2.3 Manual

Medium

medium.com › @amit25173 › understanding-quartiles-in-numpy-step-by-step-80fb48b5587a

Understanding Quartiles in NumPy (Step-by-Step) | by Amit Yadav | Medium

February 8, 2025 - NumPy interpolates values when the dataset has an even number of elements. That’s why Q1 and Q3 might not be exact values from the dataset but calculated as a weighted average between two closest values.

NumPy

numpy.org › doc › stable › reference › generated › numpy.percentile.html

numpy.percentile — NumPy v2.4 Manual

An array of weights associated with the values in a. Each value in a contributes to the percentile according to its associated weight. The weights array can either be 1-D (in which case its length must be the size of a along the given axis) or of the same shape as a.
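
A short usage sketch of the `weights` parameter described above. It assumes a NumPy version whose `percentile` accepts `weights` (the 2.x manuals listed here document it); in those versions, weights are supported only with `method="inverted_cdf"`. The `try`/`except` fallback for older installs is an assumption added for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 9.0, 3.2, 4.0])
w = np.array([2.0, 1.0, 2.0, 4.0, 1.0])

try:
    # Percentiles are given on a 0-100 scale; weights need not be normalised.
    weighted = np.percentile(a, [25, 55, 85], weights=w, method="inverted_cdf")
except TypeError:
    # Assumption: an older NumPy that predates the `weights` keyword.
    weighted = None

print(weighted)
```

Because the inverted-CDF method returns actual data values rather than interpolating, the results will generally differ from the interpolating `weighted_quantile` function shown earlier on this page.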

GitHub

github.com › numpy › numpy › issues › 8935

Weighted quantile option in nanpercentile() · Issue #8935 · numpy/numpy

April 12, 2017 -

>>> out = weighted_quantile(da=ar, q=[0.25, 0.5, 0.75], dim=['x', 'y'], w_dict={'x': [1, 1]}, interpolation='nearest')
>>> out
<xarray.DataArray (quantile: 3, z: 2)>
array([[ 8., 1.],
       [ 8., 3.],
       [ 8., 3.]])
Coordinates:
  * z         (z) int64 8 9
  * quantile  (quantile) float64 0.25 0.5 0.75
>>> np.nanpercentile(da_stacked, q=[25, 50, 75], axis=-1, interpolation='nearest')
array([[ 8., 1.],
       [ 8., 3.],
       [ 8., 3.]])

We wonder if it's ok to make this feature part of numpy, probably in np.nanpercentile?

GitHub

github.com › nudomarinero › wquantiles › blob › master › wquantiles.py

wquantiles/wquantiles.py at master · nudomarinero/wquantiles

Library to compute weighted quantiles, including the weighted median, of
numpy arrays.
"""
from __future__ import print_function
import numpy as np

__version__ = "0.4"

def quantile_1D(data, weights, quantile):
    """
    Compute the weighted quantile of a 1D numpy array.

Author nudomarinero

statsmodels

statsmodels.org › dev › generated › statsmodels.stats.weightstats.DescrStatsW.quantile.html

statsmodels.stats.weightstats.DescrStatsW.quantile - statsmodels 0.15.0 (+946)

Compute quantiles for a weighted sample.

Xarray

docs.xarray.dev › en › v2025.04.0 › generated › xarray.computation.weighted.DatasetWeighted.quantile.html

xarray.computation.weighted.DatasetWeighted.quantile

April 29, 2025 - Apply a weighted quantile to this Dataset’s data along some dimension(s). Weights are interpreted as sampling weights (or probability weights) and describe how a sample is scaled to the whole population [1]. There are other possible interpretations for weights, precision weights describing the precision of observations, or frequency weights counting the number of identical observations, however, they are not implemented here. For compatibility with NumPy’s non-weighted quantile (which is used by DataArray.quantile and Dataset.quantile), the only interpolation method supported by this weighted version corresponds to the default “linear” option of numpy.quantile.

GitHub

github.com › numpy › numpy › issues › 6326

weighted percentile · Issue #6326 · numpy/numpy

September 17, 2015 - Support for weights in percentile would be nice to have. A quick look suggests https://github.com/nudomarinero/wquantiles; I'd be happy to make a PR out of this implementation if there's in...

Author anntzer

NumPy

numpy.org › devdocs › reference › generated › numpy.percentile.html

numpy.percentile — NumPy v2.5.dev0 Manual