I am so very confused by Confidence Intervals
As the title says, I'm so confused by this concept. I've read so many explanations over the past few hours, and I'm even more confused than when I started, because a lot of the explanations seem to be contradictory.
R-bloggers states:
It is not the probability that the true value is in the confidence interval.
We are not 95% sure that the true value lies within the interval. (to me this means that we can't say with 95% confidence that the true value lies within the interval)
Here's an example of several comments I've read that support these statements:
u/TokenStraightFriend
" Building off that because I only recently came to grips with what exactly "95% confident" means. It does NOT mean that there is a 95% chance that the true population average is within that range. Instead, if we were to repeat our sample taking, measuring, and averaging, we expect for 95% of the time the average height we find will be within that range we predescribed. "
Yet other comments contradict this:
"So let's say you want to be 95% confident, so mostly certain, but with just a small degree of uncertainty. Then z=1.95, so we can say that the average population height is somewhere between 69-3(1.95) and 69+3(1.95) inches tall"
Is that not directly contradictory to what R-blogger states?
Here's an explanation from Eberly College of Science:
"
Rather than using just a point estimate, we could find an interval (or range) of values that we can be really confident contains the actual unknown population parameter. For example, we could find lower (L) and upper (U) values between which we can be really confident the population mean falls:
$L < \mu < U$
And, we could find lower (L) and upper (U) values between which we can be really confident the population proportion falls:
$L < p < U$
"
Notice they say population and not sample. The distinction is made super clear in the Eberly college example.
I keep reading the idea that if you were to construct a very large number of confidence intervals at a single confidence level, say 95%, then about 95% of those intervals would contain the true value of the parameter. That sort of explains what a confidence level is to me, but I still don't understand what it means when someone tells me "this specific confidence interval has a confidence level of 95%".
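That repeated-sampling statement is easy to check directly. Below is a minimal simulation sketch in Python; the population mean, standard deviation, and sample size are made-up numbers, and the interval used is the usual $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$:

```python
import numpy as np

# Sketch: repeat the "take a sample, then build a 95% interval" procedure
# many times and count how often the interval covers the true mean.
# All numbers here (true_mu, sigma, n) are illustrative.
rng = np.random.default_rng(0)
true_mu, sigma, n, n_experiments = 66.0, 3.0, 25, 100_000

samples = rng.normal(true_mu, sigma, size=(n_experiments, n))
means = samples.mean(axis=1)
half_width = 1.96 * sigma / np.sqrt(n)  # known-sigma 95% interval

lower, upper = means - half_width, means + half_width
coverage = np.mean((lower <= true_mu) & (true_mu <= upper))
print(f"fraction of intervals covering the true mean: {coverage:.3f}")  # ~0.950
```

Any single interval produced by a run like this either contains $\mu$ or it doesn't; the 0.95 is a property of the procedure, which is exactly the distinction the answers below turn on.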
Part of the issue is that the frequentist definition of a probability doesn't allow a nontrivial probability to be applied to the outcome of a particular experiment, but only to some fictitious population of experiments from which this particular experiment can be considered a sample. The definition of a CI is confusing as it is a statement about this (usually) fictitious population of experiments, rather than about the particular data collected in the instance at hand. So part of the issue is one of the definition of a probability: The idea of the true value lying within a particular interval with probability 95% is inconsistent with a frequentist framework.
Another aspect of the issue is that the calculation of the frequentist confidence interval doesn't use all of the information contained in the particular sample relevant to bounding the true value of the parameter. My question "Are there any examples where Bayesian credible intervals are obviously inferior to frequentist confidence intervals" discusses a paper by Edwin Jaynes with some really good examples that highlight the difference between confidence intervals and credible intervals. One that is particularly relevant to this discussion is Example 5, which discusses the difference between a credible and a confidence interval for estimating the parameter of a truncated exponential distribution (for a problem in industrial quality control). In the example he gives, there is enough information in the sample to be certain that the true value of the parameter lies nowhere in a properly constructed 90% confidence interval!
This may seem shocking to some, but the reason for this result is that confidence intervals and credible intervals are answers to two different questions, from two different interpretations of probability.
The confidence interval is the answer to the request: "Give me an interval that will bracket the true value of the parameter in $100p$% of the instances of an experiment that is repeated a large number of times." The credible interval is an answer to the request: "Give me an interval that brackets the true value with probability $p$ given the particular sample I've actually observed." To be able to answer the latter request, we must first adopt either (a) a new concept of the data generating process or (b) a different concept of the definition of probability itself.
The main reason that any particular 95% confidence interval does not imply a 95% chance of containing the mean is that the confidence interval is an answer to a different question; it is the right answer to the probability question only when the two questions happen to have the same numerical answer.
In short, credible and confidence intervals answer different questions from different perspectives; both are useful, but you need to choose the right interval for the question you actually want to ask. If you want an interval that admits an interpretation of a 95% (posterior) probability of containing the true value, then choose a credible interval (and, with it, the attendant conceptualization of probability), not a confidence interval. The thing you ought not to do is to adopt a different definition of probability in the interpretation than that used in the analysis.
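As a concrete illustration of that last point (a sketch added here for illustration, not from the original answer): for a normal sample with known $\sigma$ and a flat prior on $\mu$, the posterior is $N(\bar{x}, \sigma^2/n)$, so the central 95% credible interval consists of exactly the same two numbers as the classical 95% confidence interval. The questions differ, but their numerical answers happen to coincide:

```python
import numpy as np
from scipy import stats

# Sketch with illustrative numbers: known-sigma normal model, flat prior.
rng = np.random.default_rng(1)
sigma, n = 3.0, 25
x = rng.normal(66.0, sigma, size=n)  # 66.0 plays the (unknown) true mean
se = sigma / np.sqrt(n)

# Frequentist 95% CI: x_bar +/- 1.96 * se
ci = (x.mean() - 1.96 * se, x.mean() + 1.96 * se)

# Flat-prior 95% credible interval: central 95% of the posterior N(x_bar, se^2)
credible = stats.norm.ppf([0.025, 0.975], loc=x.mean(), scale=se)

print(ci, credible)  # the same two numbers; the interpretations still differ
```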
Thanks to @cardinal for his refinements!
Here is a concrete example, from David MacKay's excellent book "Information Theory, Inference, and Learning Algorithms" (page 464):
Let the parameter of interest be $\theta$ and the data $D$, a pair of points $x_1$ and $x_2$ drawn independently from the following distribution:
$p(x|\theta) = \left\{\begin{array}{cl} 1/2 & x = \theta,\\1/2 & x = \theta + 1, \\ 0 & \mathrm{otherwise}\end{array}\right.$
If $\theta$ is $39$, then we would expect to see the datasets $(39,39)$, $(39,40)$, $(40,39)$ and $(40,40)$ all with equal probability $1/4$. Consider the confidence interval
$[\theta_\mathrm{min}(D),\theta_\mathrm{max}(D)] = [\mathrm{min}(x_1,x_2), \mathrm{max}(x_1,x_2)]$.
Clearly this is a valid 75% confidence interval because if you re-sampled the data, $D = (x_1,x_2)$, many times then the confidence interval constructed in this way would contain the true value 75% of the time.
Now consider the data $D = (29,29)$. In this case the frequentist 75% confidence interval would be $[29, 29]$. However, assuming the model of the generating process is correct, $\theta$ could be 28 or 29 in this case, and we have no reason to suppose that 29 is more likely than 28, so the posterior probability is $p(\theta=28|D) = p(\theta=29|D) = 1/2$. So in this case the frequentist confidence interval is clearly not a 75% credible interval as there is only a 50% probability that it contains the true value of $\theta$, given what we can infer about $\theta$ from this particular sample.
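Both coverage claims are easy to verify with a short simulation (a sketch in Python, fixing $\theta = 39$ as in the example above): the interval $[\min(x_1,x_2), \max(x_1,x_2)]$ covers $\theta$ about 75% of the time overall, but only about 50% of the time on the datasets where $x_1 = x_2$.

```python
import numpy as np

# Sketch of MacKay's example: each x_i is theta or theta + 1 with
# probability 1/2; the interval is [min(x1, x2), max(x1, x2)].
rng = np.random.default_rng(2)
theta, trials = 39, 100_000

x1 = theta + rng.integers(0, 2, trials)  # each entry is 39 or 40
x2 = theta + rng.integers(0, 2, trials)
lo, hi = np.minimum(x1, x2), np.maximum(x1, x2)
covered = (lo <= theta) & (theta <= hi)

print(f"overall coverage: {covered.mean():.3f}")               # ~0.75
tied = x1 == x2                                                # datasets like (29, 29)
print(f"coverage given x1 == x2: {covered[tied].mean():.3f}")  # ~0.50
```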
Yes, this is a contrived example, but if confidence intervals and credible intervals were not different, then they would still be identical in contrived examples.
Note the key difference: the confidence interval is a statement about what would happen if you repeated the experiment many times; the credible interval is a statement about what can be inferred from this particular sample.
In frequentist statistics, probabilities describe events in the long run; they just don't apply to a single event after it's done. And running an experiment and calculating the CI is just such an event.
You wanted to compare it to the probability of a hidden coin being heads, but you can't. You can relate it to something very close, though. If your game had a rule where you must call "heads" after every flip, then the probability that you'll be correct in the long run is 50%, and that is analogous.
When you run your experiment and collect your data, you've got something similar to the actual flip of the coin. The process of the experiment is like the process of the coin flipping in that it generates an interval that captures $\mu$ or it doesn't, just like the coin comes up heads or it doesn't. Once you flip the coin, whether you see it or not, there is no probability that it's heads; it's either heads or it's not. Now suppose you call heads. That's what calculating the CI is, because you can't ever reveal the coin (your analogy to an experiment would vanish). Either you're right or you're wrong, that's it. Does its current state have any relation to the probability of it coming up heads on the next flip, or to whether I could have predicted what it is? No. The process by which heads are produced has a 0.5 probability of producing them, but that does not mean that a head that already exists has a 0.5 probability of being a head. Once you calculate your CI, there is no probability that it captures $\mu$; it either does or it doesn't. You've already flipped the coin.
OK, I think I've tortured that analogy enough. The critical point is really that your analogy is misguided. You can never reveal the coin; you can only call heads or tails based on assumptions about coins (experiments). You might want to make a bet afterwards on your heads or tails being correct, but you can't ever collect on it. Also, it's a critical component of the CI procedure that you state that the value of interest is in the interval. If you don't, then you don't have a CI (or at least not one at the stated %).
Probably the thing that makes the CI confusing is its name. It's a range of values that either does or doesn't contain $\mu$. We think it contains $\mu$, but the probability of that isn't the same as the 95% that describes the process used to construct it. The "95%" part of the name "95% CI" is just about the process. You can calculate a range that you believe, after the fact, contains $\mu$ at some probability level, but that's a different calculation and not a CI.
It's better to think of the name 95% CI as a designation of a kind of measurement of a range of values that you think plausibly contains $\mu$, and to separate the 95% from that plausibility. We could call it the Jennifer CI, while the 99% CI is the Wendy CI. That might actually be better: afterwards, we can say that we believe $\mu$ is likely to be in the range of values, and no one would get stuck saying that there is a Wendy probability that we've captured $\mu$. If you'd like a different designation, I think you should feel free to get rid of the "confidence" part of CI as well (but it is an interval).