Why is the population standard deviation the square root of the sum of (value − mean)² divided by n, while the sample standard deviation is all that over n − 1? I don't understand why you have to subtract 1 from the number of things.
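Not an answer to the algebra, but a quick way to see the effect is a rough simulation sketch (the population variance of 4 and the sample size of 5 below are made up for illustration):

```python
import random

random.seed(0)
TRUE_VAR = 4.0           # made-up population variance (SD = 2)
N, TRIALS = 5, 100_000   # small samples, repeated many times

avg_div_n = 0.0
avg_div_n1 = 0.0
for _ in range(TRIALS):
    xs = [random.gauss(0.0, TRUE_VAR ** 0.5) for _ in range(N)]
    mean = sum(xs) / N
    ss = sum((x - mean) ** 2 for x in xs)
    avg_div_n += ss / N / TRIALS          # "population" formula applied to a sample
    avg_div_n1 += ss / (N - 1) / TRIALS   # "sample" formula

# Dividing by n underestimates the true variance (4.0) on average;
# dividing by n - 1 hits it almost exactly.
print(avg_div_n)   # roughly 3.2, i.e. 4 * (n-1)/n
print(avg_div_n1)  # roughly 4.0
```

The shortfall comes from measuring spread around the sample's own mean rather than the unknown true mean, and dividing by n − 1 compensates for it on average.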
So I have 12 samples that I've tested for their thermal conductivity for a chemistry lab and want to compute their standard deviation. I've read online about the difference between population and sample SD, and it seems you only use population SD when you've tested the entire population, not just a portion of it. I'm not sure what counts as the "population" here, though. Would my 12 samples count as the entire population?
Hi,
(this isn't a homework question, don't worry)
I'm taking a college stats class right now and I'm confused about a particular concept involving the two types of standard deviation (our book doesn't explain this at all, it just gives the formulas):
There are formulas for both sample standard deviation (s) and population standard deviation (sigma). If all you have is a sample of, let's say, 5 numbers, how does the population version "know" what the overall population actually is to come up with this value? Because of this, I don't understand the meaning behind that particular stat.
Example: I enter 13,24,12,44,55 into my calculator and compute the 1-variable stats:
s = 19.16507 sigma = 17.14176
What "population" is it assuming those 5 numbers come from?
Thanks, appreciate any explanations!
I always get confused about when to use the sample or population std dev. Can someone give me a trick to remember it?
A fund had the following experience over the past 10 years:
Returns for the 10 years (in order): 4.5%, 6.0%, 1.5%, −2.0%, 0.0%, 4.5%, 3.5%, 2.5%, 5.5%, 4.0%
Q. The standard deviation of the 10 years of returns is closest to:
- 2.40%.
- 2.53%.
- 7.58%.
There is no mention in this question of whether to calculate the sample or the population standard deviation. Which one should we calculate by default?
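For what it's worth, a quick check (assuming NumPy) shows that each formula lands on a different one of the listed options, which is exactly why the wording matters:

```python
import numpy as np

returns = [4.5, 6.0, 1.5, -2.0, 0.0, 4.5, 3.5, 2.5, 5.5, 4.0]
print(np.std(returns, ddof=0))  # ~2.40 -> population formula (divide by n)
print(np.std(returns, ddof=1))  # ~2.53 -> sample formula (divide by n - 1)
```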
Hi all,
When calculating the standard deviation for a population vs. a sample of that population, we use N − 1 in the denominator for the sample but N for the population. My understanding is that the reasoning behind this has to do with the statistical inference we would eventually make from the sample onto the population. Is this correct? Can anyone unpack this a bit more?
Further, is the reasoning to do with something like the actual population having more outliers than a typical sample would have, which would skew the figure for the mean?
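For reference, the usual algebraic justification (assuming independent observations with true mean \mu and variance \sigma^2) is that the squared deviations measured around the sample mean come up short by exactly one \sigma^2 on average:

```latex
\mathbb{E}\left[\sum_{i=1}^{n}(x_i-\bar{x})^2\right]
  = \mathbb{E}\left[\sum_{i=1}^{n}(x_i-\mu)^2\right] - n\,\mathbb{E}\left[(\bar{x}-\mu)^2\right]
  = n\sigma^2 - n\cdot\frac{\sigma^2}{n}
  = (n-1)\sigma^2
```

So dividing by n − 1 rather than n gives a variance estimate that is correct on average; the bias comes from measuring spread around \bar{x} instead of the unknown \mu, not from the population having more outliers than a sample.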
Thanks in advance
Hi all, I know this type of question has been asked a lot here, but I haven’t found the answers exactly satisfactory, so I’m creating my own post.
We were given these two tables seen in the image below: https://ibb.co/TvVRPJB
Then, we were asked “Find the mean and standard deviation of each set of data by using class centres.”
I initially assumed they were asking for population standard deviation, as each of these tables is an individual set of data (and the question asks for “standard deviation of each set”). However, the answer sheet indicates that I should have used sample standard deviation.
To me, each of these sets is a discrete, full population, and the standard deviation of "each set" sounds like I should find the deviation of the individual set (population), not of the whole set of data collected (sample).
Is this just a confusingly worded question, or am I completely incorrect? If I am, please clarify where this confusion is coming from.
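The linked tables aren't reproduced here, so just to illustrate the mechanics either way, here is a sketch with made-up class centres and frequencies:

```python
# Hypothetical grouped data (NOT the values from the linked image).
centres = [5, 15, 25, 35, 45]     # class centres
freqs   = [3, 7, 12, 6, 2]        # frequencies

n = sum(freqs)
mean = sum(c * f for c, f in zip(centres, freqs)) / n
ss = sum(f * (c - mean) ** 2 for c, f in zip(centres, freqs))

pop_sd    = (ss / n) ** 0.5        # population formula: divide by n
sample_sd = (ss / (n - 1)) ** 0.5  # sample formula: divide by n - 1
print(mean, pop_sd, sample_sd)
```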
Thank you very much for any help.
Sorry for the x-post from r/statistics, but I just realized this forum existed.
I am relatively untrained, so this is partially a "how do I do this calculation" question, but also a bit of a "how do I show the meaning of this calculation to someone who isn't really a 'math person'" question.
I'm trying to find a way to express (to someone relatively unversed in statistics, not that I'm much beyond a novice) how a sample's result differs from the expected outcome given a known population mean and SD. I was hoping that within Sheets/Excel I could create a normal curve wherein I could show where the sample results fall along the population curve. But I'm not even quite sure how to create a curve within those parameters to begin with.
Say I am playing a betting game, where I know that the expected value of any bet (x) is .2x and the SD of that population is .5x. My sample mean is -1.5x, which I know just barely falls outside 3 standard deviations of the known population mean. Is there a way to compare the sample to the population? Am I even thinking about this in the right way?
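One way to put a number on it is a rough sketch like the one below; the sample size n is made up (it isn't stated above), and it uses the standard-normal CDF directly rather than a plotted curve:

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

pop_mean, pop_sd = 0.2, 0.5   # per-bet expectation and SD (in units of x)
sample_mean = -1.5
n = 30                        # hypothetical number of bets in the sample

# Comparing a single bet against the population spread:
z_single = (sample_mean - pop_mean) / pop_sd
# Comparing a *sample mean* of n bets against the spread of sample means,
# which is the standard error pop_sd / sqrt(n):
z_mean = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

print(z_single, normal_cdf(z_single))  # about -3.4 SDs for one bet
print(z_mean, normal_cdf(z_mean))      # far more extreme for an average of 30 bets
```

The key distinction is that a single observation is compared against the population SD, while a sample mean is compared against the standard error SD/sqrt(n); which one applies depends on whether the -1.5x figure is one bet or an average of many bets.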
I'm a bit confused about whether I should be using the standard deviation of the sample or of the population when accounting for propagation of uncertainty. One can convert between the two using \sigma_P^2 = \frac{n-1}{n} \sigma_S^2. Obviously, converting from the sample figure to the population figure relies on the assumption that the population is being well and accurately sampled, but even if that assumption holds and you know the size of your sample, the converted value is still only an estimate of the population standard deviation rather than the true value.
To further contextualize this, I'm writing a script with a class called Measure that does propagation of uncertainty for me when I do math with it. I've decided it's most appropriate to convert less common measures of uncertainty into standard deviations, since propagation of uncertainty isn't really built for margins of error. My issues are 1) whether I should convert to population standard deviation when the sample size is given, 2) whether the different standard deviations can be used together in propagation of uncertainty, and 3) which type of standard deviation is assumed when giving the uncertainty of a measurement.
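For context, here is a minimal sketch of the kind of Measure class described, assuming independent errors, first-order (quadrature) propagation, and one standard deviation as the uncertainty; the numbers and method choices are just illustrative:

```python
import math

class Measure:
    """A value with an uncertainty (one standard deviation), with
    first-order propagation assuming independent errors."""

    def __init__(self, value, sd):
        self.value = value
        self.sd = sd

    def __add__(self, other):
        # Absolute uncertainties of independent terms add in quadrature.
        return Measure(self.value + other.value,
                       math.hypot(self.sd, other.sd))

    def __mul__(self, other):
        value = self.value * other.value
        # Relative uncertainties add in quadrature for products.
        rel = math.hypot(self.sd / self.value, other.sd / other.value)
        return Measure(value, abs(value) * rel)

    def __repr__(self):
        return f"{self.value} ± {self.sd}"

# Example with made-up measurements:
a = Measure(10.0, 0.3)
b = Measure(4.0, 0.2)
print(a + b)   # 14.0 ± ~0.36
print(a * b)   # 40.0 ± ~2.33
```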
Say I have a population with a normal distribution and it has a mean of 1500. In this population 2400 is at twice the standard deviation above the mean.
2.2% of the population would be above 2400, correct?
If I took a random sample of the population the sample would have the same mean and SD, right? The percentage above 2400 would remain the same at (about) 2.2%?
Basic question, but it's been a long time since my single quarter of statistics.
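A quick sketch with the numbers above (mean 1500, so the SD works out to 450; the sample size of 500 below is made up):

```python
import math, random

mean, sd = 1500.0, 450.0   # 2400 is two SDs above 1500

# Exact share of a normal population above mean + 2*sd:
tail = 0.5 * math.erfc(2 / math.sqrt(2))
print(tail)  # ~0.0228, i.e. about 2.3%

# A random sample only approximates the population's mean, SD and tail share:
random.seed(0)
sample = [random.gauss(mean, sd) for _ in range(500)]
s_mean = sum(sample) / len(sample)
s_sd = (sum((x - s_mean) ** 2 for x in sample) / (len(sample) - 1)) ** 0.5
share_above = sum(x > 2400 for x in sample) / len(sample)
print(s_mean, s_sd, share_above)
```

The sample's mean, SD, and share above 2400 hover near the population values but won't match them exactly; the approximation tightens as the sample grows.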
Hi all,
Would it be possible to advise in which scenario one would use a sample standard deviation and in which scenario one would use a population standard deviation?
For example, say a mutual fund has 10 funds. Would the standard deviation be computed by using sample or population standard deviation?
Thank you!
If you do not wish to generalize your results outside of these 10 funds, as in comparing them with other funds, you may use the population standard deviation. Otherwise, use the sample standard deviation.
Would the standard deviation be computed by using sample or population standard deviation?
The standard deviation of . . . what, exactly?
Can anyone tell me when I should use the sample standard deviation vs the population standard deviation in a practice question for Quants?
According to my textbook, “A population parameter is a fixed but unknown characteristic that describes an entire population.”
A population parameter sounds like a population standard deviation then, since it describes how an entire population varies from the mean. So how are they different?
I've taken a look at an earlier eli5 but am still very confused.
When you compute the standard deviation from a sample, you almost always have to compute it "around" the observed mean of the sample (not the true mean of the population) because the true mean of the population is unknown. The difference between the observed mean and the true mean causes a bias in the standard deviation which can be corrected by using a different formula.
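A small illustration of that point (a Python sketch with made-up population values): for any fixed dataset, the squared deviations about the sample mean sum to less than (or equal to) the squared deviations about any other point, including the true mean, which is where the downward bias comes from.

```python
import random

random.seed(0)
TRUE_MEAN, TRUE_SD = 10.0, 2.0   # made-up population parameters
sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(8)]
xbar = sum(sample) / len(sample)

ss_about_sample_mean = sum((x - xbar) ** 2 for x in sample)
ss_about_true_mean = sum((x - TRUE_MEAN) ** 2 for x in sample)

# The sample mean minimizes the sum of squared deviations, so spread
# measured around it understates spread around the (unknown) true mean.
print(ss_about_sample_mean <= ss_about_true_mean)  # True, every time
```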
You may have figured all this out already, but here's something that tripped me up for a very long time when I first learned statistics (and I'm hardly an expert, so take this with a grain of salt):
The reason you take a sample is to estimate parameters (mean and stdev, usually) of the distribution that describes the whole population from which the sample is drawn. The formula for "population standard deviation" is the definition of standard deviation of a statistical distribution. The formula for "sample standard deviation" is just a way to estimate the standard deviation of the population distribution given a sample drawn from the distribution. They are different formulas because they do completely different things. One estimates the other based on a subset of the data (the sample).
There is no difference in appearance between the "population mean" formula and the "sample mean" formula, but there is still a difference in interpretation: the former is the definition of "mean" of a statistical distribution, while the latter tells you how to figure out (approximately) what this mean is from only your sample.
The fact that the "population" vs "sample" mean formulae do not differ in appearance is perhaps even more surprising than the fact that the "population" vs "sample" stdev formulae do differ. But the other answer does a pretty good job of explaining why this is the case.
Hello,
I'm taking a statistics course and I am a bit confused about how this relationship works.
If the sample size is 100 vs. a sample size of 1000, would the mean be smaller or larger?
How would the standard deviation be affected?
What is the probability that either sample contains the lowest value sampled?
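A quick simulation sketch of all three questions, assuming both samples are drawn from the same normal population (the population parameters below are made up):

```python
import random

random.seed(1)
POP_MEAN, POP_SD = 50.0, 10.0   # made-up population parameters

def draw(n):
    xs = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    mean = sum(xs) / n
    sd = (sum((x - mean) ** 2 for x in xs) / (n - 1)) ** 0.5   # sample SD
    return xs, mean, sd

small, m100, sd100 = draw(100)
large, m1000, sd1000 = draw(1000)

# Both samples estimate the same population mean and SD; the larger one
# just varies less from run to run (neither is systematically bigger).
print(m100, sd100)
print(m1000, sd1000)

# Which sample holds the overall minimum? With i.i.d. draws from the same
# continuous distribution, the smaller sample wins with probability 100/1100.
print(min(small) < min(large))
```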
Thank you