What's the difference between standard error and standard deviation?
What is standard error?
What's the difference between a point estimate and an interval estimate?
Would any kind soul provide me with an example to help me understand it?
To complete the answer to the question: Ocram nicely addressed standard error but did not contrast it with standard deviation and did not mention the dependence on sample size. As a special case for the estimator, consider the sample mean. The standard error of the mean is $\sigma/\sqrt{n}$, where $\sigma$ is the population standard deviation and $n$ is the sample size. So in this example we see explicitly how the standard error decreases with increasing sample size. The standard deviation is most often used to refer to the individual observations. So standard deviation describes the variability of the individual observations, while standard error shows the variability of the estimator. Good estimators are consistent, which means that they converge to the true parameter value as the sample size increases. In most cases this happens because the standard error goes to 0 as the sample size grows, as we see explicitly with the sample mean.
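The dependence on sample size can be checked with a small simulation. The sketch below is only an illustration, not part of the original answer: it assumes normally distributed data with made-up values (mean 10, standard deviation 2, 4000 repeated samples), and the function names are hypothetical. It draws many samples of size n, takes the mean of each, and compares the spread of those means with the population standard deviation divided by the square root of n.

```python
import random
import statistics


def empirical_se_of_mean(n, sigma=2.0, mu=10.0, trials=4000, seed=0):
    """Standard deviation of `trials` sample means, each from n normal draws.

    mu, sigma, trials and seed are arbitrary illustration values.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    # The spread of the sample means is the standard error of the mean.
    return statistics.stdev(means)


def theoretical_se_of_mean(n, sigma=2.0):
    """Population standard deviation divided by sqrt(n)."""
    return sigma / n ** 0.5


for n in (4, 16, 64):
    print(n, round(empirical_se_of_mean(n), 3), round(theoretical_se_of_mean(n), 3))
```

Each fourfold increase in sample size should roughly halve both columns, while the standard deviation of the individual observations stays at the assumed value of 2.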
Here is a more practical (and not mathematical) answer:
- The SD (standard deviation) quantifies scatter: how much the values vary from one another.
- The SEM (standard error of the mean) quantifies how precisely you know the true mean of the population. It takes into account both the value of the SD and the sample size.
- Both SD and SEM are in the same units -- the units of the data.
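- The SEM, by definition, is always smaller than the SD for any sample with more than one value, because it is the SD divided by the square root of the sample size.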
- The SEM gets smaller as your samples get larger. This makes sense, because the mean of a large sample is likely to be closer to the true population mean than is the mean of a small sample. With a huge sample, you'll know the value of the mean with a lot of precision even if the data are very scattered.
- The SD does not change predictably as you acquire more data. The SD you compute from a sample is the best possible estimate of the SD of the overall population. As you collect more data, you'll assess the SD of the population with more precision. But you can't predict whether the SD from a larger sample will be bigger or smaller than the SD from a small sample. (This is a simplification, not quite true. See comments below.)
Note that standard errors can be computed for almost any parameter you compute from data, not just the mean. The phrase "the standard error" is a bit ambiguous. The points above refer only to the standard error of the mean.
(From the GraphPad Statistics Guide that I wrote.)
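The bullet points above can be tried out on simulated data. This is a minimal sketch and not taken from the guide: the data are invented (normal draws with mean 50 and SD 8), and `sd_and_sem` is a hypothetical helper name. It computes the sample SD and the SEM (SD divided by the square root of the sample size) for growing subsets of the same data.

```python
import math
import random
import statistics


def sd_and_sem(values):
    """Return (sample SD, standard error of the mean) for a list of numbers."""
    sd = statistics.stdev(values)          # scatter of the individual values
    sem = sd / math.sqrt(len(values))      # precision of the sample mean
    return sd, sem


# Invented data: 10,000 draws from a normal distribution (mean 50, SD 8).
rng = random.Random(42)
simulated = [rng.gauss(50.0, 8.0) for _ in range(10_000)]

for n in (10, 100, 1_000, 10_000):
    sd, sem = sd_and_sem(simulated[:n])
    print(f"n={n:>6}  SD={sd:.2f}  SEM={sem:.3f}")
```

With data like these, the SD should settle near the assumed value of 8 as n grows, while the SEM keeps shrinking. Both are in the units of the data.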
Absolutely it is not just a question of honesty, or anything to do with stats v measurement error. The standard error of the mean and the standard deviation of the population are two different things.
The mean of your sample is a random variable, because it would be different every time you ran the sampling process. The standard error of the mean is just the estimated standard deviation of the sample mean.
It's not quite clear what you mean by comparing data points to your model. But if you mean you are interested in whether a particular data point is plausibly from the population you have modelled (e.g. to ask "is this number a really big outlier?"), you need to compare it to your estimate of the population mean and your estimate of the population standard deviation (not the sample mean's standard deviation, also known as the SEM). So use the standard deviation in this case.
More generally, it sounds like you are using the standard deviation inappropriately in some other circumstances. If you are trying to report inferences about the population mean you should use the sample mean's standard deviation / standard error, not the population standard deviation.
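The two uses can be put side by side in a short sketch. Everything here is an assumption for illustration: the ten measurements are made up, the function names are hypothetical, and the interval uses the normal-approximation multiplier 1.96 (for a sample this small, a t quantile would be the more careful choice). The first function judges a single data point against the estimated population SD; the second reports an interval for the population mean using the SEM.

```python
import math
import statistics


def z_score_against_population(x, sample):
    """How many estimated population SDs the value x lies from the sample mean."""
    return (x - statistics.fmean(sample)) / statistics.stdev(sample)


def approx_95_interval_for_mean(sample):
    """Rough 95% interval for the population mean (normal approximation)."""
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - 1.96 * sem, mean + 1.96 * sem


# Made-up measurements, for illustration only.
sample = [9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7, 10.2, 10.1, 9.5]

new_point = 10.5
z = z_score_against_population(new_point, sample)
low, high = approx_95_interval_for_mean(sample)

print(f"z-score of {new_point} against the population SD: {z:.2f}")
print(f"approximate 95% interval for the mean: ({low:.3f}, {high:.3f})")
```

In this invented example the new point is less than two SDs from the mean, so it is not a striking outlier as an individual observation, yet it falls outside the interval for the mean. That is the distinction between the two quantities: the SD describes individual values, the SEM describes the estimate of the mean.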
I think of it this way: the SEM is a measure of the precision of a sample mean. But a sample mean from one experiment (say, with 3-6 replicates) is hardly enough to gauge the precision of a population mean. For me, one needs to do multiple independent experiments (each with replicates to generate a mean), and then each of these means is taken as an individual n. So, instead of individual samples (replicates from an experiment), I use only the individual means of several (at least 3, more often 4-6) independent experiments to calculate the SEM, and I use the STD of those means in the calculation. Since few (including me) perform experiments more than 3 times, I elect to use the STD of a "representative" experiment. I think the SEM is not very useful, and most people use it simply to reduce the size of the error bar. Therefore, I do not like the SEM much and loathe its pervasive use in science.