Brave Search

numpy standard deviation does not give the same result as scipy stats standard deviation

stackoverflow.com › questions › 74276989 › numpy-standard-deviation-does-not-give-the-same-result-as-scipy-stats-standard-d

To get the same values, pass ddof=1 to np.std():

import numpy as np
import scipy.stats
ar = np.arange(20)
print(np.std(ar, ddof=1))
print(scipy.stats.tstd(ar))

Output:

5.916079783099616
5.916079783099616

My mentor used to say:

- ddof=1 if you're calculating np.std() for a sample taken from your complete dataset.

- ddof=0 if you're calculating for the full population.

Answer from Bhargav on Stack Overflow
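
The rule of thumb above can be checked by hand. The following is a minimal sketch, assuming the same np.arange(20) data as in the answer; the helper name std_by_hand is made up for illustration and is not part of NumPy or SciPy.

```python
import numpy as np

def std_by_hand(a, ddof=0):
    """Standard deviation with divisor N - ddof (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    squared_dev = (a - a.mean()) ** 2
    return np.sqrt(squared_dev.sum() / (a.size - ddof))

ar = np.arange(20)
population_std = std_by_hand(ar, ddof=0)  # divisor N, should match np.std(ar)
sample_std = std_by_hand(ar, ddof=1)      # divisor N - 1, should match np.std(ar, ddof=1)
print(population_std, sample_std)
```

The only difference between the two results is the divisor, which is why the gap shrinks as the array gets longer.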

SciPy

docs.scipy.org › doc › scipy › reference › generated › scipy.stats.tstd.html

tstd — SciPy v1.17.0 Manual

scipy.stats.tstd(a, limits=None, inclusive=(True, True), axis=0, ddof=1, *, nan_policy='propagate', keepdims=False)
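
To illustrate what the limits and ddof parameters in this signature do, here is a small sketch. The limits (3, 17) and the input array are illustrative choices, not values taken from the search result.

```python
import numpy as np
from scipy import stats

x = np.arange(20)
lower, upper = 3, 17  # illustrative limits; both ends are inclusive by default

trimmed = stats.tstd(x, limits=(lower, upper))

# Equivalent by hand: drop values outside [lower, upper], then use ddof=1.
kept = x[(x >= lower) & (x <= upper)]
by_hand = np.std(kept, ddof=1)
print(trimmed, by_hand)
```

With limits=None nothing is trimmed, so tstd reduces to the ordinary sample standard deviation (ddof=1).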

SciPy

docs.scipy.org › doc › scipy-0.7.x › reference › generated › scipy.stats.std.html

scipy.stats.std — SciPy v0.7 Reference Guide (DRAFT)

scipy.stats.std(a, axis=0, bias=False) · Returns the estimated population standard deviation of the values in the passed array (i.e., N-1). Axis can equal None (ravel array first), or an integer (the axis over which to operate).

SciPy

docs.scipy.org › doc › scipy › reference › stats.html

Statistical functions (scipy.stats) — SciPy v1.17.0 Manual

trimmed_stde · trimr · trimtail · trimboth · winsorize · zmap · zscore · Other · argstoarray · count_tied_groups · msign · compare_medians_ms · median_cihs · mjci · mquantiles_cimj · rsh · Random Number Generators (scipy.stats.sampling) Generators Wrapped ·

SciPy

docs.scipy.org › doc › scipy-0.8.x › reference › generated › scipy.stats.std.html

scipy.stats.std — SciPy v0.8 Reference Guide (DRAFT)

September 2, 2010 - scipy.stats.std(a, axis=0, bias=False) · Returns the estimated population standard deviation of the values in the passed array (i.e., N-1). Axis can equal None (ravel array first), or an integer (the axis over which to operate).

SciPy

docs.scipy.org › doc › scipy › reference › generated › scipy.stats.gstd.html

gstd — SciPy v1.17.0 Manual

where \(n\) is the number of observations, \(d\) is the adjustment ddof to the degrees of freedom, and \(\bar y\) denotes the mean of the natural logarithms of the observations. Note that the default ddof=1 is different from the default value used by similar functions, such as numpy.std and numpy.var.
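
The description above amounts to exponentiating the standard deviation of the log-transformed data. A minimal sketch, using made-up positive data, compares scipy.stats.gstd with that by-hand computation for both ddof values:

```python
import numpy as np
from scipy import stats

y = np.array([1.0, 2.0, 4.0, 8.0, 16.0])  # illustrative positive observations

# gstd defaults to ddof=1, unlike np.std, which defaults to ddof=0.
gstd_default = stats.gstd(y)
gstd_ddof0 = stats.gstd(y, ddof=0)

# By hand: standard deviation of the natural logs, then exponentiate.
by_hand_ddof1 = np.exp(np.std(np.log(y), ddof=1))
by_hand_ddof0 = np.exp(np.std(np.log(y)))
print(gstd_default, by_hand_ddof1)
print(gstd_ddof0, by_hand_ddof0)
```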

SciPy

docs.scipy.org › doc › scipy › reference › generated › scipy.stats.rv_continuous.std.html

std — SciPy v1.17.0 Manual

scipy.stats.rv_continuous.std(*args, **kwds) · Standard deviation of the distribution. Parameters: arg1, arg2, arg3,… : array_like · The shape parameter(s) for the distribution (see docstring of the instance object for more information) · loc : array_like, optional ·

GeeksforGeeks

geeksforgeeks.org › python › scipy-stats-tstd-function-python

sciPy stats.tstd() function | Python - GeeksforGeeks

February 10, 2019 - scipy.stats.tstd(array, limits=None, inclusive=(True, True)) calculates the trimmed standard deviation of the array elements along the specified axis of the array. Its formula - Parameters : array: Input array or object having the elements ...

SciPy

docs.scipy.org › doc › scipy › reference › generated › scipy.stats.circstd.html

circstd — SciPy v1.17.0 Manual

If an int, the axis of the input along which to compute the statistic. The statistic of each axis-slice (e.g. row) of the input will appear in a corresponding element of the output.

SciPy

docs.scipy.org › doc › scipy › tutorial › stats.html

Statistics (scipy.stats) — SciPy v1.17.0 Manual

In this tutorial, we discuss many, but certainly not all, features of scipy.stats. The intention here is to provide a user with a working knowledge of this package.

Stack Overflow

stackoverflow.com › questions › 74276989 › numpy-standard-deviation-does-not-give-the-same-result-as-scipy-stats-standard-d

python - numpy standard deviation does not give the same result as scipy stats standard deviation - Stack Overflow

Top answer

1 of 1

The scipy.stats.lognorm lognormal distribution is parameterised in a slightly unusual way, in order to be consistent with the other continuous distributions. The first argument is the shape parameter, which is your sigma. That's followed by the loc and scale arguments, which allow shifting and scaling of the distribution. Here you want loc=0.0 and scale=exp(mu). So to compute the mean, you want to do something like:

>>> import numpy as np
>>> from scipy.stats import lognorm
>>> mu = 0.4104857306
>>> sigma = 3.4070874277012617
>>> lognorm.mean(sigma, 0.0, np.exp(mu))
500.0000010889041

Or more clearly: pass the scale parameter by name, and leave the loc parameter at its default of 0.0:

>>> lognorm.mean(sigma, scale=np.exp(mu))
500.0000010889041

As @coldspeed says in his comment, your expected value for the standard deviation doesn't look right. I get:

>>> lognorm.std(sigma, scale=np.exp(mu))
165831.2402402415

and I get the same value calculating by hand.

To double check that these parameter choices are indeed giving the expected lognormal, I created a sample of a million deviates and looked at the mean and standard deviation of the log of that sample. As expected, those give me back values that look roughly like your original mu and sigma:

>>> samples = lognorm.rvs(sigma, scale=np.exp(mu), size=10**6)
>>> np.log(samples).mean()  # should be close to mu
0.4134644116056518
>>> np.log(samples).std(ddof=1)  # should be close to sigma
3.4050012251732285

In response to the edit: you've got the formula for the variance of a lognormal slightly wrong - you need to replace the exp(sigma**2 - 1) term with (exp(sigma**2) - 1). If you do that, and rerun the fsolve computation, you get:

sigma = 0.9444564779275075
mu = 5.768609079062494

And with those values, you should get the expected mean and standard deviation:

>>> from scipy.stats import lognorm
>>> import numpy as np
>>> sigma = 0.9444564779275075
>>> mu = 5.768609079062494
>>> lognorm.mean(sigma, scale=np.exp(mu))
499.9999999949592
>>> lognorm.std(sigma, scale=np.exp(mu))
599.9999996859631

Rather than using fsolve, you could also solve analytically for sigma and mu, given the desired mean and standard deviation. This gives you more accurate results, more quickly:

>>> mean = 500.0
>>> var = 600.0**2
>>> sigma = np.sqrt(np.log1p(var/mean**2))
>>> mu = np.log(mean) - 0.5*sigma*sigma
>>> mu, sigma
(5.768609078769636, 0.9444564782482624)
>>> lognorm.mean(sigma, scale=np.exp(mu))
499.99999999999966
>>> lognorm.std(sigma, scale=np.exp(mu))
599.9999999999995
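
For reference, the analytic solution used in the last snippet follows from the standard lognormal moment formulas. Writing m for the desired mean and v for the desired variance (symbols chosen here for illustration), a sketch of the derivation is:

```latex
\begin{align}
m &= e^{\mu + \sigma^2/2}, \\
v &= \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}
   = \left(e^{\sigma^2} - 1\right) m^2, \\
\sigma^2 &= \ln\!\left(1 + \frac{v}{m^2}\right), \\
\mu &= \ln m - \tfrac{1}{2}\sigma^2 .
\end{align}
```

The third line corresponds to np.log1p(var/mean**2) and the fourth to np.log(mean) - 0.5*sigma*sigma in the code above.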

SciPy

docs.scipy.org › doc › scipy › reference › generated › scipy.stats.mstats.trimmed_std.html

trimmed_std — SciPy v1.16.2 Manual

scipy.stats.mstats.trimmed_std(a, limits=(0.1, 0.1), inclusive=(1, 1), relative=True, axis=None, ddof=0) · Returns the trimmed standard deviation of the data along the given axis. Parameters: a : sequence · Input array · limits : {None, tuple}, optional ·

SciPy

scipy.github.io › devdocs › reference › generated › scipy.stats.rv_continuous.std.html

std — SciPy v1.17.0.dev Manual

NumPy

numpy.org › doc › stable › reference › generated › numpy.std.html

numpy.std — NumPy v2.4 Manual

N = len(a) d2 = abs(a - mean)**2 ... `ddof` std = var**0.5 · Different values of the argument ddof are useful in different contexts. NumPy’s default ddof=0 corresponds with the expression: ... which is sometimes called the “population standard deviation” in the field of statistics because it ...

SciPy

docs.scipy.org › doc › scipy › reference › generated › scipy.stats.norm.html

scipy.stats.norm — SciPy v1.17.0 Manual

A normal continuous random variable · The location (loc) keyword specifies the mean. The scale (scale) keyword specifies the standard deviation
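
As a quick illustration of the loc and scale keywords, the sketch below builds a frozen normal distribution with arbitrary example values and reads its mean and standard deviation back. Note that this is the standard deviation of the distribution itself, not an estimate from data, so no ddof is involved.

```python
from scipy.stats import norm

# Illustrative parameters: mean 2.0, standard deviation 3.0.
dist = norm(loc=2.0, scale=3.0)

dist_mean = dist.mean()  # equals loc
dist_std = dist.std()    # equals scale
print(dist_mean, dist_std)
```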

Python

docs.python.org › 3 › library › statistics.html

statistics — Mathematical statistics functions

Makes a normal distribution instance with mu and sigma parameters estimated from the data using fmean() and stdev(). The data can be any iterable and should consist of values that can be converted to type float. If data does not contain at least two elements, raises StatisticsError because it takes at least one point to estimate a central value and at least two points to estimate dispersion.
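
The behaviour described here belongs to NormalDist.from_samples() in the standard library. A minimal sketch with made-up data shows that the fitted parameters agree with fmean() and stdev(), where stdev() is the sample standard deviation (the N - 1 divisor):

```python
from statistics import NormalDist, fmean, stdev

data = [2.5, 3.1, 2.1, 2.4, 2.7, 3.5]  # illustrative sample

dist = NormalDist.from_samples(data)
print(dist.mean, dist.stdev)

# The same estimates computed directly.
print(fmean(data), stdev(data))
```

To compare against NumPy, stdev(data) corresponds to np.std(data, ddof=1), while statistics.pstdev(data) corresponds to the NumPy default ddof=0.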

SciPy

docs.scipy.org › doc › scipy › reference › generated › scipy.stats.sem.html

sem — SciPy v1.17.0 Manual

If an int, the axis of the input along which to compute the statistic. The statistic of each axis-slice (e.g. row) of the input will appear in a corresponding element of the output.
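
A short sketch of how the axis argument shapes the output of scipy.stats.sem, together with a by-hand check. The input arrays are illustrative, and the check assumes sem's default ddof=1.

```python
import numpy as np
from scipy import stats

x = np.arange(20)
sem_scipy = stats.sem(x)  # standard error of the mean, default ddof=1
sem_by_hand = np.std(x, ddof=1) / np.sqrt(x.size)
print(sem_scipy, sem_by_hand)

# With a 2-D input, one statistic is returned per axis-slice.
table = np.arange(20).reshape(4, 5)
per_column = stats.sem(table, axis=0)  # one value for each of the 5 columns
per_row = stats.sem(table, axis=1)     # one value for each of the 4 rows
print(per_column.shape, per_row.shape)
```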