People use $p_0$ instead of $\hat{p}$ because in hypothesis testing we want to know how likely it is to observe the current sample assuming that the null hypothesis is true. If the null hypothesis is true, we know the standard deviation in the case of a Binomial experiment, and we should use it!
On the other hand, I imagine the $s$ you are referring to comes from doing hypothesis tests on normal distributions (or some unknown distribution, applying the CLT) where we want to test for the mean ($\mu$) but we don't know the standard deviation ($\sigma$). Hence, we have no $\sigma$ to plug in, and have no choice but to use an estimate of the standard deviation, $s$.
Indeed, if $\sigma$ is known, one should use it over $s$, just like here we use $p_0$ instead of $\hat{p}$.
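To make the distinction concrete, here is a minimal sketch of a one-sample proportion z-test in which the standard error is computed from the null value $p_0$ rather than from $\hat{p}$. The function name and the numbers are my own, purely for illustration.

```python
import math

def proportion_z_test(successes, n, p0):
    """Z statistic for H0: p = p0, with the SE computed under the null."""
    p_hat = successes / n
    # Under H0 the standard deviation of p_hat is known: sqrt(p0*(1-p0)/n)
    se_null = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se_null

# Example: 60 successes in 100 trials, testing H0: p = 0.5
z = proportion_z_test(60, 100, 0.5)
print(round(z, 3))  # (0.6 - 0.5) / sqrt(0.25/100) = 2.0
```

Using $\hat{p}$ in `se_null` instead would give a slightly different (and, under the null, less appropriate) statistic.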
I'm doing some review for my Stats exam coming up. In my notes I wrote down that p^ (p hat) is the expected value while p0 is the "observed" or actual value.
So if I were to say the "expected" number of successes is sample size n times p^, or n(p^), is that correct? Because on the answer key it said the "expected number of successes" is n(p0), which translates to sample size n times the OBSERVED value.
EDIT: Did some more digging. I looked through the conditions needed to make an inference about a proportion.
One of the conditions was: under H0: p = p0, both n(p0) and n(1-p0) had to be greater than or equal to 10 in order to do a significance test. For the confidence interval you switch p0 for p^.
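The condition in the EDIT above (the "large counts" check) can be sketched as a tiny helper. The function name and the example numbers are my own, for illustration only; for a significance test you would pass the null value p0, and for a confidence interval the sample proportion p^.

```python
def large_counts_ok(n, p, threshold=10):
    """Check that the expected counts of successes and failures are both large."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(large_counts_ok(100, 0.5))  # True: both 50 and 50 are >= 10
print(large_counts_ok(30, 0.1))   # False: 30 * 0.1 = 3 < 10
```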
In the second formula, you are using the normal approximation which can be used when the number of trials is large. For normally distributed variables, the mean and variance are independent of each other, so you have to use the estimated variance below since you do not know it. This is also why the normal distribution has two parameters.
However, with a binomially distributed variable, the mean and variance are not independent. This is why it has only one parameter. Under the null that $p = p_0$, this pins down the variance as $np_0(1-p_0)$ (or $p_0(1-p_0)/n$ for the sample proportion), so you can use that instead.
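A quick numerical sketch of this point: once the null fixes $p$, the variance comes for free, with no separate estimate needed (unlike the normal case, where the variance is a second, independent parameter). The numbers here are illustrative.

```python
# Binomial(n, p): fixing p under the null also fixes the variance.
n, p0 = 100, 0.3

mean_null = n * p0             # expected count under H0
var_null = n * p0 * (1 - p0)   # variance is determined by the same p0

print(mean_null, var_null)
```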
In the frequentist tradition (which is what you are using here) the random variable is the data. The population parameters are mathematically treated as constant. This is what leads to the somewhat counterintuitive "null hypothesis" setup we use in intro statistics, because the probability we return (usually in the form of a p-value) is a probability on the sample given constant population parameter set at the null hypothesis values.
I would imagine this is why you see the notation you do in many introductory textbooks.
I don't see a $\hat{p}$ in the figure you posted, but from the formula in the figure, $p_1$ and $p_2$ are statistics. Once you calculate a statistic, it becomes a realization of the random variable (Be aware that I am not saying that your statistic is the true population parameter).
Above all, remember that in most cases upper/lower cases are conventions. They might be widespread, which can be helpful in many cases, but there is no law that forces you to write a random variable's "name" in uppercase. It's common for introductory (and even advanced) books to have a discussion on symbols and style. That section will help you understand the notation the author(s) has adopted.
Both questions are essentially applications of the Central Limit Theorem, which says (informally) that "the average (or sum) of many independent samples from a common population will tend to a normal distribution as the number of samples becomes large".
The two questions differ in the type of data that they treat. The "xbar" question concerns temperature, which is a continuous measurement (e.g. a decimal number). The "phat" question implicitly concerns a binary measurement (true/false, e.g. each student either invests or does not).
Commonly a measurement of a random variable will be denoted by $x$. For a random sample $x_1,\ldots,x_N$ the sample mean will then be denoted by $\bar{x}=\frac{1}{N}\sum_ix_i$. This applies directly to the "xbar" question. Here each $x_i$ is a temperature measurement, and the question asks about the sampling distribution of $\bar{x}$. (This arises when $\bar{x}$ is computed many times over different samples, each of size $N$).
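The sampling distribution of $\bar{x}$ described above can be made tangible with a small simulation: draw many samples of size $N$ from a common population and look at the spread of the resulting means. The population parameters below are made up for illustration.

```python
import random
import statistics

random.seed(0)
N, reps = 50, 2000
pop_mu, pop_sigma = 20.0, 5.0  # a hypothetical temperature population

# Compute xbar for many independent samples of size N
means = [
    statistics.fmean(random.gauss(pop_mu, pop_sigma) for _ in range(N))
    for _ in range(reps)
]

print(round(statistics.fmean(means), 2))  # close to pop_mu
print(round(statistics.stdev(means), 2))  # close to pop_sigma / sqrt(N)
```

The standard deviation of the simulated means should land near $\sigma/\sqrt{N} = 5/\sqrt{50} \approx 0.71$, which is exactly the "standard error" that these sampling-distribution questions ask about.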
For the "phat" question, the notation and logic is consistent with this, but the connection is a little more involved. In this case each $x_i$ will correspond to an individual student, who either invests ($x=1$) or does not ($x=0$). The probability that a student will invest would commonly be denoted by $p$ ($=30\%$ in this case). These conventions of $\Pr[\text{true}]=p$ and $\{\text{true,false}\}=\{1,0\}$ are standard for the case of a binary random variable.
Now imagine we do not know the value of $p$, but wish to estimate it from a random sample of students $x_1,\ldots,x_N$. For a single student the expected value of $x_i$ is $p$, denoted $\mathbb{E}[x]=p$ (see also here). Similarly, by the properties of expectation, for the sample we have $\mathbb{E}[\bar{x}]=p$. So here the sample mean $\bar{x}$ provides an estimate of the population parameter $p$. In statistics it is standard practice to denote an estimate of a population parameter by using a "hat", so it makes sense to denote the sample mean by $\hat{p}$.
(For the "xbar" problem the comparable notation would be $\bar{x}=\hat{\mu}$, as there $x$ is normal rather than Bernoulli.)
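This connection between $\hat{p}$ and a sample mean of 0/1 data is easy to check numerically. A minimal sketch, where $p = 0.30$ comes from the question and the sample size is my own choice:

```python
import random

random.seed(1)
p, N = 0.30, 500

# Each student is a Bernoulli draw: invests (1) with probability p, else 0
students = [1 if random.random() < p else 0 for _ in range(N)]

# p_hat is literally the sample mean of the 0/1 values
p_hat = sum(students) / N
print(p_hat)  # an estimate of p, close to 0.30
```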
The image below could be a handy tip: it clearly distinguishes between the sample mean and the sample proportion.
Source
Source info: UF Biostatistics Open learning textbook, Module 9, Sampling Distribution of the Sample Mean (in case the link dies in the future)