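There are, in fact, two different formulas for standard deviation here: the population standard deviation $\sigma$ and the sample standard deviation $s$.
If $x_1, x_2, \ldots, x_N$ denote all $N$ values from a population, then the (population) standard deviation is
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2},$$
where $\mu$ is the mean of the population.
If $x_1, x_2, \ldots, x_N$ denote $N$ values from a sample, however, then the (sample) standard deviation is
$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2},$$
where $\bar{x}$ is the mean of the sample.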
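To make the difference between the two divisors concrete, here is a minimal Python sketch. The data values and the helper names `population_sd` and `sample_sd` are made up for illustration; `statistics.pstdev` and `statistics.stdev` are the standard library's population and sample versions, printed alongside for comparison.

```python
import math
import statistics

data = [3, 5, 7, 9, 11]  # hypothetical values, chosen only for illustration


def population_sd(values):
    """Treat values as the whole population: divide by N."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_sd(values):
    """Treat values as a sample: divide by N - 1 (needs at least two values)."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


# The hand-written versions next to the standard-library equivalents.
print(population_sd(data), statistics.pstdev(data))
print(sample_sd(data), statistics.stdev(data))
```

For the same data the sample version is always the larger of the two, because the same sum of squares is divided by a smaller number.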
The reason for the change in formula with the sample is this: When you're calculating $s$ you are normally using $s^2$ (the sample variance) to estimate $\sigma^2$ (the population variance). The problem, though, is that if you don't know $\sigma$ you generally don't know the population mean $\mu$, either, and so you have to use $\bar{x}$ in the place in the formula where you normally would use $\mu$. Doing so introduces a slight bias into the calculation: Since $\bar{x}$ is calculated from the sample, the values of $x_i$ are on average closer to $\bar{x}$ than they would be to $\mu$, and so the sum of squares $\sum_{i=1}^{N} (x_i - \bar{x})^2$ turns out to be smaller on average than $\sum_{i=1}^{N} (x_i - \mu)^2$. It just so happens that this bias can be corrected by dividing by $N-1$ instead of $N$. (Proving this is a standard exercise in an advanced undergraduate or beginning graduate course in statistical theory.) The technical term here is that $s^2$ (because of the division by $N-1$) is an unbiased estimator of $\sigma^2$.
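The bias can be checked exactly on a small case rather than proved in general. The sketch below is an illustration only: it assumes a hypothetical population (the six faces of a fair die) and samples of size 3 drawn independently with replacement, then averages both estimators over every possible sample. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product


def sum_sq_dev(sample):
    """Sum of squared deviations of the sample values from the sample mean."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample)


population = [1, 2, 3, 4, 5, 6]  # hypothetical population (a fair die)
N = 3                            # sample size, chosen to keep the enumeration small

mu = Fraction(sum(population), len(population))
sigma_sq = sum((x - mu) ** 2 for x in population) / len(population)

# Every ordered sample of size N drawn with replacement is equally likely,
# so a plain average over all of them is the exact expected value.
samples = list(product(population, repeat=N))
avg_div_n = sum(sum_sq_dev(s) / N for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (N - 1) for s in samples) / len(samples)

print("population variance:         ", sigma_sq)
print("average when dividing by N:  ", avg_div_n)
print("average when dividing by N-1:", avg_div_n_minus_1)
```

Under these assumptions the average with divisor $N-1$ equals the population variance exactly, while the average with divisor $N$ falls short by the factor $(N-1)/N$.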
Another way to think about it is that with a sample you have $N$ independent pieces of information. However, since $\bar{x}$ is the average of those $N$ pieces, if you know $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_{N-1} - \bar{x}$, you can figure out what $x_N - \bar{x}$ is. So when you're squaring and adding up the residuals $x_i - \bar{x}$, there are really only $N-1$ independent pieces of information there. So in that sense perhaps dividing by $N-1$ rather than $N$ makes sense. The technical term here is that there are $N-1$ degrees of freedom in the residuals $x_i - \bar{x}$.
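The degrees-of-freedom point can be seen directly in a few lines of Python. This is a small sketch with made-up sample values: the residuals about the sample mean always sum to zero, so the last one is determined by the other $N-1$.

```python
data = [4.0, 7.0, 1.0, 10.0, 3.0]  # hypothetical sample values

x_bar = sum(data) / len(data)
residuals = [x - x_bar for x in data]

# The residuals about the sample mean sum to zero (up to rounding),
# so the last residual is the negative of the sum of the first N - 1.
recovered_last = -sum(residuals[:-1])

print("residuals:         ", residuals)
print("sum of residuals:  ", sum(residuals))
print("last, recovered:   ", recovered_last)
print("last, actual:      ", residuals[-1])
```

The same constraint does not hold for deviations from the population mean $\mu$, which is why those keep all $N$ degrees of freedom.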
For more information, see Wikipedia's article on the sample standard deviation.
Answer from Mike Spivey on Stack Exchange
What Is Population Standard Deviation?
The population standard deviation measures the variability of data in a population. It is usually an unknown constant. σ (Greek letter sigma) is the symbol for the population standard deviation.
What Is The Formula of Population Standard Deviation?
The following is the population standard deviation formula:
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$
Where:
σ = population standard deviation
x1, ..., xN = the population data set
μ = mean of the population data set
N = size of the population data set
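As a worked example of the population formula, take a hypothetical population of $N = 8$ values, 2, 4, 4, 4, 5, 5, 7, 9 (chosen only because the arithmetic comes out round):

```latex
\begin{align*}
\mu &= \frac{2+4+4+4+5+5+7+9}{8} = \frac{40}{8} = 5, \\
\sum_{i=1}^{8} (x_i-\mu)^2 &= 9+1+1+1+0+0+4+16 = 32, \\
\sigma &= \sqrt{\frac{32}{8}} = \sqrt{4} = 2.
\end{align*}
```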