How do I evaluate standard deviation? - Cross Validated
How to Determine if Standard Deviation Is High/Low?
descriptive statistics - Large vs. Small Standard Deviation - Cross Validated
ELI5: What is standard deviation?
Two groups of students take an exam.
Group A averages 70%. Nearly every student gets a score of like 67%, 71%, 73%, etc.
Group B averages 70%. A lot of students get 100s, but a lot of other students bomb the test and get 40%.
If you look only at the average, the two groups look the same - they both averaged a 70. But they're not the same. Group B has a bunch of geniuses and a bunch of idiots, while Group A has mostly average people.
The standard deviation is a measure of how "spread out" the data is. A low standard deviation means that most of the data points are very close to the average, while a high standard deviation means that there's a lot of data that's spread out far from the average. So Group B has a much higher standard deviation than Group A.
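To make that concrete, here is a minimal Python sketch; the individual scores are invented, chosen only so that both groups average 70:

    import statistics

    group_a = [67, 68, 70, 71, 71, 73]      # everyone close to the average
    group_b = [40, 40, 40, 100, 100, 100]   # half bomb it, half ace it

    print(statistics.mean(group_a), statistics.stdev(group_a))  # 70, about 2.2
    print(statistics.mean(group_b), statistics.stdev(group_b))  # 70, about 32.9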
Standard deviations aren't "good" or "bad". They are indicators of how spread out your data is. Sometimes, in ratings scales, we want wide spread because it indicates that our questions/ratings cover the range of the group we are rating. Other times, we want a small sd because we want everyone to be "high".
For example, if you were testing the math skills of students in a calculus course, you could get a very small sd by asking them elementary arithmetic questions. But suppose you gave a more serious placement test for calculus (that is, students who passed would go into Calculus I, those who did not would take lower-level courses first). You might expect a lower sd (and a higher average) among freshmen at MIT than at South Podunk State, given the same test.
So: what is the purpose of your test? Who is in the sample?
Short answer: it's fine, and a bit lower than I might have expected from survey data. But your business story is probably more in the mean or the top-2-box percent.
For discrete scales from social science research, in practice the standard deviation is a direct function of the mean. In particular, I have found through empirical analysis of many such studies that the actual standard deviation in such surveys is 40%-60% of the maximum possible variation (alas, undocumented here).
At the simplest level, consider the extremes: imagine that the mean were 5.0. The standard deviation must be zero, as the only way to average 5 is for everyone to answer 5. Conversely, if the mean were 1.0 then the standard deviation must be 0 as well. So at the extremes the standard deviation is precisely determined by the mean.
Now in between there's more grey area. Imagine that people could answer either 5.0 or 1.0 but nothing in between. Then the standard deviation is a precise function of the mean:
stdev = sqrt((5 - mean) * (mean - 1))

The maximum standard deviation for answers on any bounded scale is half the scale width. Here that maximum occurs at a mean of 3: sqrt((5-3)*(3-1)) = sqrt(2*2) = 2.
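As a quick sanity check on that formula, here is a small sketch; it uses the population standard deviation, which is what the algebra above gives:

    import math
    import statistics

    # Every answer is either 1 or 5; sweep the share of 5s and compare the
    # observed SD with sqrt((5 - mean) * (mean - 1)).
    for n_fives in (0, 5, 10, 15, 20):
        data = [5] * n_fives + [1] * (20 - n_fives)
        m = statistics.mean(data)
        sd = statistics.pstdev(data)                  # population SD
        print(m, sd, math.sqrt((5 - m) * (m - 1)))    # last two columns agree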
Now of course people can answer values in between. From metastudies of survey data in our firm, I find that the standard deviation for numeric scales in practice is 40%-60% of the maximum. Specifically:
- 40% for 100-point scales,
- 50% for 10-point scales,
- 60% for 5-point scales, and
- 100% for binary scales.
So for your dataset, I would expect a standard deviation of 60% x 2.0 = 1.2. You got 0.54, which is about half what I would have expected if the results were self-explicated ratings. Are the skill ratings the results of more complicated batteries of tests that are averaged and thus would have lower variance?
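Here is a minimal sketch of that expectation; the scale endpoints are my assumptions (e.g. 1-10 for the 10-point scale), and the percentages are simply the rule-of-thumb figures quoted above:

    # Expected SD = rule-of-thumb factor x (half the scale width).
    rules = {
        "100-point scale": (0, 100, 0.40),
        "10-point scale":  (1, 10,  0.50),
        "5-point scale":   (1, 5,   0.60),
        "binary (0/1)":    (0, 1,   1.00),
    }
    for name, (lo, hi, factor) in rules.items():
        max_sd = (hi - lo) / 2
        print(f"{name}: expect an SD around {factor * max_sd:.2f}")
    # 5-point scale: 0.60 * 2.0 = 1.2, versus the observed 0.54.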
The real story, though, is probably whether the ability is especially low or high relative to other tasks. Report the means or top-2-box percentages across skills and focus your analysis on those.
As other users have mentioned in the comments, "small" and "large" are arbitrary and depend on the context. However, one very simple way to think about whether a standard deviation is small or large is as follows. If you assume that your data is normally distributed, then approximately 68% of your data points fall within one standard deviation of the mean. In the case of your data, this would mean 68% of students scored between roughly 63 and 95, and conversely 32% scored either above 95 or below 63. This gives a practical way to understand what your standard deviation is telling you (again, under the assumption that your data is normal). If you would have expected a greater percentage to fall between 63 and 95, then your standard deviation may be considered large, and if you would have expected a smaller percentage, then your standard deviation may be considered small.
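A small simulation sketch of that 68% figure; the normality assumption is the one stated above, and the mean of 79 and standard deviation of 16 are simply inferred from the 63-95 interval:

    import numpy as np

    rng = np.random.default_rng(0)
    scores = rng.normal(loc=79, scale=16, size=100_000)  # simulated exam scores
    within_one_sd = np.mean((scores >= 63) & (scores <= 95))
    print(within_one_sd)  # close to 0.68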
I believe that standard deviation is a tool to compare two or more data sets. The data set with the higher standard deviation is the one considered large, since its data are more spread out relative to the mean; one with a lower standard deviation is considered small.
It is also a tool to evaluate how the numbers are spread out within a single data set.
The standard deviation could be considered big or small based on whatever purpose the data set is serving, for example salaries of entry-level jobs, or the one-mile run times of a particular sports team. For the sports team, you may have one athlete who is much faster than the others, and you can use the standard deviation to see how far he is from the mean. Bottom line: it depends on how you want to use your data. If you think it is small, it is small; if you think it is big, it is big.
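A tiny sketch of that sports-team idea, with invented mile times for a hypothetical team:

    import statistics

    mile_times = [6.9, 7.1, 7.0, 7.2, 6.8, 5.2]  # minutes; the last runner is much faster
    mean_t = statistics.mean(mile_times)
    sd_t = statistics.stdev(mile_times)
    z = (mile_times[-1] - mean_t) / sd_t
    print(f"fastest runner is {abs(z):.1f} standard deviations below the team mean")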