In statistics, the standard deviation is a measure of the amount of variation of the values of a variable about its mean. A low standard deviation indicates that the values tend to … Wikipedia
🌐
Quora
quora.com › What-standard-deviation-is-considered-high
What standard deviation is considered high? - Quora
Answer (1 of 5): Any standard deviation value greater than or equal to 2 can be considered high. In a normal distribution, there is an empirical assumption that most of the data will be spread around the mean.
🌐
Reddit
reddit.com › r/datascience › i still don't understand what standard deviation, even in grad school.
I still don't understand what standard deviation, even in grad school. : r/datascience
January 14, 2023 - Basically variance or standard deviations are a measure of how surprised you are to see numbers that are far away from the usual. So for example if you took everyone's age in the world, the average would likely be 40ish, just a random guess. But it's not super crazy to see people who are newborns or 80 years old. On the other hand say you're measuring the age of people in high ...
🌐
LeanScape
leanscape.io › home › lean wiki › demystifying standard deviation: a beginner's guide
Demystifying Standard Deviation: A Beginner's Guide - LeanScape
September 23, 2024 - This might be considered "good" in contexts where consistency or predictability is desired. Conversely, a high standard deviation (significantly higher than 1) indicates that data points spread out over a wider range, signifying high variability.
🌐
GeeksforGeeks
geeksforgeeks.org › mathematics › how-to-determine-if-standard-deviation-is-high-low
How to Determine if Standard Deviation Is High/Low? - GeeksforGeeks
February 12, 2024 - A high standard deviation indicates that the data points are spread out over a wider range, while a low standard deviation suggests that the data points are clustered closely around the mean.
🌐
Statology
statology.org › home › what is considered a good standard deviation?
What is Considered a Good Standard Deviation?
May 10, 2021 - In general, a CV value greater than 1 is often considered high. For example, suppose a realtor collects data on the price of 100 houses in her city and finds that the mean price is $150,000 and the standard deviation of prices is $12,000.
🌐
University of Waterloo
uwaterloo.ca › teaching-assessment-processes › about-standard-deviation
About the standard deviation | Teaching Assessment Processes | University of Waterloo
October 6, 2023 - Standard deviation indicates the variability of data: the degree to which SCP scores vary around the mean. A higher standard deviation means that there is high variability in the data. A lower standard deviation means that there is less variability in the data.
🌐
JMP
jmp.com › en › statistics-knowledge-portal › measures-of-central-tendency-and-variability › standard-deviation
Standard Deviation
The standard deviation measures the spread of a set of data values. A high standard deviation indicates a wide spread of data values, while a low standard deviation indicates a narrow spread of values clustered around the mean of the data set.
🌐
National Library of Medicine
nlm.nih.gov › oet › ed › stats › 02-900.html
Standard Deviation - Finding and Using Health Statistics - NIH
A standard deviation (or σ) is a measure of how dispersed the data is in relation to the mean. Low, or small, standard deviation indicates data are clustered tightly around the mean, and high, or large, standard deviation indicates data are more spread out.
🌐
Investopedia
investopedia.com › terms › s › standarddeviation.asp
Standard Deviation Formula and Uses, vs. Variance
June 5, 2025 - The greater the standard deviation ... For example, a volatile stock has a high standard deviation, meaning that its price goes up and down frequently....
🌐
The Journalist's Resource
journalistsresource.org › home › what's standard deviation? 4 things journalists need to know
What's standard deviation? 4 things journalists need to know
October 5, 2022 - When researchers analyze quantitative ... to gauge how close or far apart the data are. A higher standard deviation means the data are more spread out....
🌐
Reddit
reddit.com › r/askstatistics › can i justify the standard deviation is high in this case?
r/AskStatistics on Reddit: Can I justify the standard deviation is high in this case?
September 14, 2023 -

[SOLVED by u/BurkeyAcademy]

Hello, could you review the following logic:

I designed an item with a length of 5.00 metres. After manufacturing, I collected data on hundreds of lengths of the item:

Mean = 5.10m (i.e. on average 0.10m longer); standard deviation = 0.30m

My view is that we cannot trust the 0.10m value, because the standard deviation is 0.30m. However, I looked up how to justify variability, and found the Coefficient of Variation (CV), which is based on SD/mean.

Also, is there any way I can assess the SD against the 0.10m value?

Top answer
1 of 2
5
Trigger warning to stats pedants -- I am doing some hand-wavy things below to help answer this person's question in an easy-to-understand way. Please don't get triggered. ☺ If he wants I can "do the exactly correct math" and go through the epistemological underpinnings of inference... but that doesn't seem to be what is asked. (Not that what I do is wrong, but it would get points taken off a university exam.)

My view is that we cannot trust the 0.10m value, because the standard deviation is 0.30m.

The standard deviation measures the variability of one bar to the next -- I think about it as "a common difference an individual bar is from the average length". So, if the real average is indeed 5.1, many bars will be in the range 4.8 to 5.4. Using something called "the empirical rule", we might guess that somewhere in the neighborhood of 2/3 of the bars might be in this range -- check this for fun! (Note, this makes some assumptions about the "shape" of the distribution of lengths of course, but it wouldn't surprise me if this was approximately true -- say between 55% and 80% of bars in the range I mentioned.)

The standard error seems to be a good fit if your goal is to see how accurate your 0.10 measurement difference might be. You say that you measured hundreds... let's assume that means 200 for the moment. Take your sd = 0.3 and divide by sqrt(200) and we get 0.021. This is a measurement of the "precision" to which we can estimate the average length of the entire population of bars.

In a similar manner to what I said above, we can be reasonably confident that the average of all bars generated by the same process (assuming the process isn't varying over time, like having different people cut them) will be in the range 5.1 +/- 0.021. It is common to multiply this 0.021 by a number around 2 to make a "95% confidence interval for the mean". If we do this, we can say something kind of like: given this data we can be around 95% confident that the range 5.1 +/- 0.042, or 5.058 to 5.142, might contain the true average length (not strictly the correct interpretation, but close enough as an introduction to these concepts on Reddit).

So, this helps you understand how much you can trust the 0.1 difference. While it probably isn't exactly 0.1, the average is probably between 0.058 and 0.142 longer than 5 meters. Please feel free to ask followup questions if you like.
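
(A minimal sketch, in Python, of the arithmetic in the answer above; the n = 200 figure is the answer's own assumption, and the "about 2" multiplier is used instead of an exact t or z quantile.)

    import math

    n = 200          # the answer's assumed sample size ("hundreds" of bars)
    mean_len = 5.10  # sample mean length, metres
    sd = 0.30        # sample standard deviation, metres

    se = sd / math.sqrt(n)    # standard error of the mean, ~0.021 m
    half_width = 2 * se       # rough 95% half-width, ~0.042 m

    print(f"standard error ~ {se:.3f} m")
    print(f"rough 95% CI for the mean: {mean_len - half_width:.3f} m to {mean_len + half_width:.3f} m")
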
2 of 2
2
I am not following your logic here. If you have calculated the mean of the sample to be 5.10 then that is the mean of the sample - there is nothing that you need to "trust". If you are saying that the SD is high, so you feel the estimate of the mean is imprecise, then that may be true. In which case, that would indicate the variability of the lengths is high. There are items that have a much smaller length than 5.10 and others with a much higher length. Assuming a normal distribution of the lengths, 95% of the items will have a length in the range 5.10 +/- 0.60.

Regarding your question about assessing the SD relative to the mean: firstly, the most important thing is your field-specific knowledge of the subject. Is a 0.30 deviation from 5.00 acceptable? If you want the item to be made accurate to the nearest mm then obviously this SD is very large. However, if it only needs to be made accurate to the nearest metre, then this SD may be acceptable. This very much depends on the use case and is not something that can be solved with maths alone.

For computing a value on the relationship between the SD and the mean you can use (as you have already said) the coefficient of variation, or one of its many alternatives.
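
(For the same hypothetical figures, the coefficient of variation and the 2-SD range mentioned above can be computed directly; this sketch is not part of the original reply.)

    mean_len, sd = 5.10, 0.30

    cv = sd / mean_len                                  # coefficient of variation, ~0.06
    low, high = mean_len - 2 * sd, mean_len + 2 * sd    # ~95% of items under normality

    print(f"CV = {cv:.3f}")
    print(f"~95% of items between {low:.2f} m and {high:.2f} m")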
🌐
Pacific Northwest Doves & Pigeons
science.halleyhosting.com › sci › soph › inquiry › standdev2.htm
What Does Standard Deviation Show us About Our Data?
Standard deviation is a mathematical tool to help us assess how far the values are spread above and below the mean. A high standard deviation shows that the data is widely spread (less reliable) and a low standard deviation shows that the data are clustered closely around the mean (more reliable).
🌐
Quora
quora.com › What-is-high-standard-deviation
What is 'high standard deviation'? - Quora
Answer (1 of 8): Standard deviation ... the numbers are very close to the average. A high standard deviation means that the numbers are spread out....
Top answer
1 of 2
2

As other users have mentioned in the comments, "small" and "large" are arbitrary and depend on the context. However, one very simple way to think about whether a standard deviation is small or large is as follows. If you assume that your data is normally distributed, then approximately 68% of your data points fall between one standard deviation below the mean, and one standard deviation above the mean. In the case of your data, this would mean 68% of students scored between roughly 63 and 95, and conversely 32% scored either above 95 or below 63. This gives a practical way to understand what your standard deviation is telling you (again, under the assumption that your data is normal). If you would have expected a greater percentage to fall between 63 and 95, then your standard deviation may be considered large, and if you would have expected a smaller percentage, then your standard deviation may be considered small.
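
(A small sketch of that empirical-rule reading, assuming the scores really are normal with the mean of about 79 and SD of about 16 implied by the 63-95 range above; scipy is used only to evaluate the normal CDF.)

    from scipy.stats import norm

    mu, sigma = 79, 16   # implied by the 63-95 example above; illustrative only

    within_one_sd = norm.cdf(mu + sigma, mu, sigma) - norm.cdf(mu - sigma, mu, sigma)
    print(f"P(63 < score < 95) under normality ~ {within_one_sd:.3f}")   # about 0.68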

2 of 2
0

I believe that standard deviation is a tool to compare two or more data sets. Thus, the data set with the higher standard deviation will be the one considered large, where the data are more spread out relative to the mean. On the other hand, a lower standard deviation will be considered small.

Also, it is a tool to evaluate how the numbers are spread out within one data set.

The standard deviation could be considered big or small based on whatever purpose the data set is serving. For example: salaries of entry-level jobs, or the run time of one mile for a particular sports team. For the sports team, you may have one athlete who is way faster than the others; thus, we can use the standard deviation to see how far he is from the mean. Bottom line: it depends on how you want to use your data. If you think it is small, it is small; if you think it is big, it is big.
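
(A toy illustration of that comparison, with made-up numbers: two data sets with the same mean but very different spread.)

    import numpy as np

    group_a = np.array([48.0, 50.0, 51.0, 49.0, 52.0])   # tightly clustered
    group_b = np.array([30.0, 45.0, 50.0, 55.0, 70.0])   # widely spread

    for name, data in (("A", group_a), ("B", group_b)):
        print(f"{name}: mean = {data.mean():.1f}, sd = {data.std(ddof=1):.2f}")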

🌐
Standard Deviation Calculator
standarddeviationcalculator.io › blog › how-to-interpret-standard-deviation-results
How to Interpret Standard Deviation Results
A low standard deviation signifies that the values tend to be close to the mean, whereas a high standard deviation indicates that the values are spread out over a wider range.
🌐
Z Score Table
z-table.com › what-is-high-standard-deviation.html
What is High Standard Deviation - Z SCORE TABLE
Interpreting Standard Deviation: Standard deviation measures the spread or dispersion of data points around the mean of a dataset. A high standard deviation indicates that data points are widely dispersed from the mean, suggesting greater variability within the dataset.
Top answer
1 of 2
6

Discussion of the new question:

For example, if I want to study human body size and I find that adult human body size has a standard deviation of 2 cm, I would probably infer that adult human body size is very uniform

It depends on what we're comparing to. What's the standard of comparison that makes that very uniform? If you compare it to the variability in bolt lengths for a particular type of bolt, that might be hugely variable.

while a 2 cm standard deviation in the size of mice would mean that mice differ surprisingly much in body size.

By comparison to the same thing in your more-uniform humans example, certainly; when it comes to lengths of things, which can only be positive, it probably makes more sense to compare coefficients of variation (as I point out in my original answer), which is the same as the sd-to-mean comparison you're suggesting here.

Obviously the meaning of the standard deviation is its relation to the mean,

No, not always. In the case of sizes of things or amounts of things (e.g. tonnage of coal, volume of money), that often makes sense, but in other contexts it doesn't make sense to compare to the mean.

Even then, they're not necessarily comparable from one thing to another. There's no applies-to-all-things standard of how variable something is before it's variable.

and a standard deviation around a tenth of the mean is unremarkable (e.g. for IQ: SD = 0.15 * M).

Which things are we comparing here? Lengths to IQ's? Why does it make sense to compare one set of things to another? Note that the choice of mean 100 and sd 15 for one kind of IQ test is entirely arbitrary. They don't have units. It could as easily have been mean 0 sd 1 or mean 0.5 and sd 0.1.

But what is considered "small" and what is "large", when it comes to the relation between standard deviation and mean?

Already covered in my original answer but more eloquently covered in whuber's comment -- there is no one standard, and there can't be.

Some of my points about Cohen there still apply to this case (sd relative to mean is at least unit-free); but even with something like say Cohen's d, a suitable standard in one context isn't necessarily suitable in another.


Answers to an earlier version

We always calculate and report means and standard deviations.

Well, maybe a lot of the time; I don't know that I always do it. There's cases where it's not that relevant.

But what does the size of the variance actually mean?

The standard deviation is a kind of average* distance from the mean. The variance is the square of the standard deviation. Standard deviation is measured in the same units as the data; variance is in squared units.

*(RMS -- https://en.wikipedia.org/wiki/Root_mean_square)

They tell you something about how "spread out" the data are (or the distribution, in the case that you're calculating the sd or variance of a distribution).
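
(A quick numeric illustration of that "RMS distance from the mean" reading, with an arbitrary small data set of my own; numpy's default ddof=0 matches the hand computation.)

    import numpy as np

    x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])   # arbitrary example data

    deviations = x - x.mean()                 # distances from the mean (the mean is 5.0)
    rms = np.sqrt(np.mean(deviations ** 2))   # root-mean-square deviation

    print(rms)          # 2.0 -- the standard deviation, in the same units as the data
    print(np.std(x))    # same value (numpy's default is the ddof=0 "population" formula)
    print(np.var(x))    # 4.0 -- the variance, in squared units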

For example, assume we are observing which seat people take in an empty room. If we observe that the majority of people sit close to the window with little variance,

That's not exactly a case of recording "which seat" but recording "distance from the window". (Knowing "the majority sit close to the window" doesn't necessarily tell you anything about the mean nor the variation about the mean. What it tells you is that the median distance from the window must be small.)

we can assume this to mean that people generally prefer sitting near the window, and that getting a view or enough light is the main motivating factor in choosing a seat.

That the median is small doesn't of itself tell you that. You might infer it from other considerations, but there may be all manner of reasons for it that we can't in any way discern from the data.

If on the other hand we observe that while the largest proportion sit close to the window there is a large variance with other seats taken often also (e.g. many sit close to the door, others sit close to the water dispenser or the newspapers), we might assume that while many people prefer to sit close to the window, there seem to be more factors than light or view that influence choice of seating and differing preferences in different people.

Again, you're bringing in information outside the data; it might apply or it might not. For all we know the light is better far from the window, because the day is overcast or the blinds are drawn.

At what values can we say that the behavior we have observed is very varied (different people like to sit in different places)?

What makes a standard deviation large or small is not determined by some external standard but by subject matter considerations, and to some extent what you're doing with the data, and even personal factors.

However, with positive measurements, such as distances, it's sometimes relevant to consider standard deviation relative to the mean (the coefficient of variation); it's still arbitrary, but distributions with coefficients of variation much smaller than 1 (standard deviation much smaller than the mean) are "different" in some sense than ones where it's much greater than 1 (standard deviation much larger than the mean, which will often tend to be heavily right skew).
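
(A sketch of that coefficient-of-variation contrast using three illustrative scipy distributions chosen here, not taken from the answer: a tight normal with CV well below 1, an exponential with CV exactly 1, and a lognormal with CV well above 1 and heavy right skew.)

    from scipy import stats

    dists = {
        "normal(10, 1)":    stats.norm(loc=10, scale=1),
        "exponential(1)":   stats.expon(scale=1.0),
        "lognormal(s=1.5)": stats.lognorm(s=1.5),
    }

    for name, d in dists.items():
        cv = d.std() / d.mean()               # coefficient of variation, sd / mean
        skew = float(d.stats(moments="s"))    # third standardized moment
        print(f"{name:17s}  CV = {cv:5.2f}   skewness = {skew:6.2f}")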

And when can we infer that behavior is mostly uniform (everyone likes to sit at the window)

Be wary of using the word "uniform" in that sense, since it's easy to misinterpret your meaning (e.g. if I say that people are "uniformly seated about the room", that means almost the opposite of what you mean). More generally, when discussing statistics, avoid using jargon terms in their ordinary sense.

and the little variation our data shows is mostly a result of random effects or confounding variables (dirt on one chair, the sun having moved and more shade in the back, etc.)?

No, again, you're bringing in external information to the statistical quantity you're discussing. The variance doesn't tell you any such thing.

Are there guidelines for assessing the magnitude of variance in data, similar to Cohen's guidelines for interpreting effect size (a correlation of 0.5 is large, 0.3 is moderate, and 0.1 is small)?

Not in general, no.

  1. Cohen's discussion[1] of effect sizes is more nuanced and situational than you indicate; he gives a table of 8 different values of small medium and large depending on what kind of thing is being discussed. Those numbers you give apply to differences in independent means (Cohen's d).

  2. Cohen's effect sizes are all scaled to be unitless quantities. Standard deviation and variance are not -- change the units and both will change.

  3. Cohen's effect sizes are intended to apply in a particular application area (and even then I regard too much focus on those standards of what's small, medium and large as both somewhat arbitrary and somewhat more prescriptive than I'd like). They're more or less reasonable for their intended application area but may be entirely unsuitable in other areas (high-energy physics, for example, frequently requires effects that cover many standard errors, but equivalents of Cohen's effect sizes may be many orders of magnitude more than what's attainable).

For example, if 90% (or only 30%) of observations fall within one standard deviation from the mean, is that uncommon or completely unremarkable?

Ah, note now that you have stopped discussing the size of standard deviation / variance, and started discussing the proportion of observations within one standard deviation of the mean, an entirely different concept. Very roughly speaking this is more related to the peakedness of the distribution.

For example, without changing the variance at all, I can change the proportion of a population within 1 sd of the mean quite readily. If the population has a $t_3$ distribution, about 94% of it lies within 1 sd of the mean, if it has a uniform distribution, about 58% lies within 1 sd of the mean; and with a beta($\frac18,\frac18$) distribution, it's about 29%; this can happen with all of them having the same standard deviations, or with any of them being larger or smaller without changing those percentages -- it's not really related to spread at all, because you defined the interval in terms of standard deviation.
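
(The point that this proportion is a shape property rather than a spread property can be checked directly; the distributions below are the ones named above, with parameterizations I have assumed, and the printed percentages are simply whatever the CDFs give.)

    from scipy import stats

    dists = {
        "t (3 df)":       stats.t(df=3),
        "uniform(0, 1)":  stats.uniform(loc=0, scale=1),
        "beta(1/8, 1/8)": stats.beta(a=0.125, b=0.125),
        "normal(0, 1)":   stats.norm(),
    }

    for name, d in dists.items():
        mu, sigma = d.mean(), d.std()
        within = d.cdf(mu + sigma) - d.cdf(mu - sigma)   # mass within one SD of the mean
        print(f"{name:15s}  P(|X - mu| <= sigma) = {within:.2f}")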

[1]: Cohen, J. (1992), "A power primer," Psychological Bulletin, 112(1), 155–159.

2 of 2
4

By Chebyshev's inequality we know that the probability of some $x$ being at least $k$ times $\sigma$ away from the mean is at most $\frac{1}{k^2}$:

$$ \Pr(|X-\mu|\geq k\sigma) \leq \frac{1}{k^2} $$

However, by making some distributional assumptions you can be more precise; e.g. a normal approximation leads to the 68–95–99.7 rule. Generally, using any cumulative distribution function you can choose some interval that should encompass a certain percentage of cases. However, choosing the confidence interval width is a subjective decision, as discussed in this thread.
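
(A short comparison of the distribution-free Chebyshev bound with the exact two-sided normal tail, for k = 1, 2, 3; a sketch using only the standard normal survival function.)

    from scipy.stats import norm

    for k in (1, 2, 3):
        chebyshev_bound = 1 / k**2      # P(|X - mu| >= k*sigma) <= 1/k^2 for any finite-variance X
        normal_tail = 2 * norm.sf(k)    # exact two-sided tail under normality
        print(f"k = {k}:  Chebyshev bound = {chebyshev_bound:.3f},  normal tail = {normal_tail:.4f}")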

Example
The most intuitive example that comes to my mind is the intelligence scale. Intelligence is something that cannot be measured directly; we do not have direct "units" of intelligence (by the way, centimeters or degrees Celsius are also somewhat arbitrary). Intelligence tests are scored so that they have a mean of 100 and a standard deviation of 15. What does it tell us? Knowing the mean and standard deviation, we can easily infer which scores can be regarded as "low", "average", or "high". As "average" we can classify scores that are obtained by most people (say 50%); higher scores can be classified as "above average", uncommonly high scores can be classified as "superior", etc. This translates to the table below.

Wechsler (WAIS–III) 1997 IQ test classification

IQ Range ("deviation IQ")   IQ Classification
130 and above               Very superior
120–129                     Superior
110–119                     High average
90–109                      Average
80–89                       Low average
70–79                       Borderline
69 and below                Extremely low

(Source: https://en.wikipedia.org/wiki/IQ_classification)

So the standard deviation tells us how far we can expect individual values to be from the mean. You can think of $\sigma$ as a unit-free distance from the mean. If you think of observable scores, say intelligence test scores, then knowing the standard deviation enables you to easily infer how far (how many $\sigma$'s) some value lies from the mean, and so how common or uncommon it is. It is subjective how many $\sigma$'s qualify as "far away", but this can easily be quantified by thinking in terms of the probability of observing values lying within a certain distance of the mean.
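
(A hypothetical helper, not part of the original answer, that turns the quoted mean-100/SD-15 convention and the table above into code: it reports how many $\sigma$'s a score lies from the mean and the corresponding band.)

    def classify_iq(score: float) -> str:
        """Band labels from the WAIS-III table quoted above."""
        bands = [(130, "Very superior"), (120, "Superior"), (110, "High average"),
                 (90, "Average"), (80, "Low average"), (70, "Borderline")]
        for cutoff, label in bands:
            if score >= cutoff:
                return label
        return "Extremely low"

    for iq in (85, 100, 127):
        z = (iq - 100) / 15               # distance from the mean in units of sigma
        print(f"IQ {iq}: z = {z:+.2f}, classified as {classify_iq(iq)}")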

This is obvious if you look at what the variance ($\sigma^2$) is:

$$ \operatorname{Var}(X) = \operatorname{E}\left[(X - \mu)^2 \right]. $$

...the expected (average) squared distance of the $X$'s from $\mu$. If you wonder why it is squared, you can read about that elsewhere.
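
As a concrete check of that definition (a worked example of my own, not from the answer), take a single roll of a fair six-sided die:

$$ \mu = \operatorname{E}[X] = 3.5, \qquad \operatorname{Var}(X) = \operatorname{E}\left[(X-\mu)^2\right] = \frac{(1-3.5)^2 + \dots + (6-3.5)^2}{6} = \frac{17.5}{6} \approx 2.92, $$

so $\sigma = \sqrt{2.92} \approx 1.71$: a typical roll lands about $1.7$ away from the mean of $3.5$.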