Your question starts out as if the statistical null and alternative hypotheses are what you are interested in, but the penultimate sentence makes me think that you might be more interested in the difference between scientific and statistical hypotheses.
Statistical hypotheses can only be those that are expressible within a statistical model. They typically concern values of parameters within the statistical model. Scientific hypotheses almost invariably concern the real world, and they often do not directly translate into the much more limited universe of the chosen statistical model. Few introductory stats books spend any real time considering what constitutes a statistical model (it can be very complicated) and the trivial examples used have scientific hypotheses so simple that the distinction between model and real-world hypotheses is blurry.
I have written an extensive account of hypothesis and significance testing that includes several sections dealing with the distinction between scientific and statistical hypotheses, as well as the dangers that might come from assuming a match between the statistical model and the real-world scientific concerns: A Reckless Guide to P-values
So, to answer your explicit questions:
• No, statisticians do not always use null and alternative hypotheses. Many statistical methods do not require them.
• It is common practice in some disciplines (and maybe some schools of statistics) to specify the null and alternative hypotheses when a hypothesis test is being used. However, you should note that a hypothesis test requires an explicit alternative at the planning stage (e.g. for sample size determination), but once the data are in hand that alternative is no longer relevant; many times the post-data alternative can be no more than 'not the null'. (A sketch of such a sample-size calculation follows this answer.)
• I'm not sure of the mental heuristic thing, but it does seem possible to me that the beginner courses omit so much detail in the service of simplicity that the word 'hypothesis' loses its already vague meaning.
Answer from Michael Lew on Stack Exchange
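To illustrate the planning-stage point in the second bullet above, here is a minimal sketch (not part of the original answer): a sample-size calculation requires an explicit alternative in the form of an assumed effect size, even though the test itself, once the data are in, does not. The effect size, power, and significance level below are illustrative assumptions only.

```python
# Minimal sketch: the alternative enters at the planning stage as an assumed
# effect size used to choose the sample size. All numbers are illustrative.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,        # assumed alternative: half a standard deviation difference
    alpha=0.05,             # planned significance level
    power=0.80,             # desired probability of rejecting a false null
    alternative="two-sided",
)
print(round(n_per_group))   # roughly 64 participants per group under these assumptions
```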
You wrote
the declaration of a null and alternative hypothesis is the "first step" of any good experiment and subsequent analysis.
Well, you did put quotes around first step, but I'd say the first step in an experiment is figuring out what you want to figure out.
As to "subsequent analysis", it might even be that the subsequent analysis does not involve testing a hypothesis! Maybe you just want to estimate a parameter. Personally, I think tests are overused.
Often, you know in advance that the null is false and you just want to see what is actually going on.
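To illustrate the point about estimation, here is a minimal sketch (with made-up data, not from the original answer) that reports a parameter estimate and a confidence interval instead of a test result.

```python
# Minimal sketch: estimate the parameter of interest and quantify uncertainty,
# rather than testing a null hypothesis. The data here are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.3, scale=1.0, size=50)   # hypothetical measurements

mean = sample.mean()
sem = stats.sem(sample)                            # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
print(f"estimate: {mean:.2f}, 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```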
The choice of null and alternative hypothesis depends on which claim requires the burden of proof.
A commonly used analogy comes from the legal doctrine of "innocent until proven guilty." In a jury trial, defendants are presumed innocent, and are only convicted if there is guilt "beyond a reasonable doubt." This means the evidence weighing in favor of guilt must be so overwhelming that it is highly implausible to a reasonable person that the defendant could still be innocent in light of the evidence presented.
The rationale for this doctrine is that it is a far greater miscarriage of justice to wrongly convict an innocent person than it is to fail to convict the guilty on the basis of inadequate evidence of guilt. Consequently, such a legal system would prefer to reduce the likelihood of the former than the latter.
In hypothesis testing, a similar notion applies to the null and alternative hypotheses, with the null being "innocence" and the alternative being "guilt." Therefore, which claim is the null and which is the alternative depends on which one must be demonstrated by the data with a high degree of confidence before it can be accepted. Correspondingly, the only conclusions that are possible in a statistical hypothesis test are:
- Reject the null hypothesis (i.e., accept the alternative)
- Fail to reject the null hypothesis (i.e., the evidence is inconclusive).
It is a common misconception to characterize the second conclusion as being equivalent to "accepting the null hypothesis." This is wrong because the entire test is performed under the presumption of the null, just as a jury trial is held under the presumption of innocence: the prosecutor will argue "if the defendant is innocent, then how could all the evidence point to their guilt?"
To further extend the analogy, a conclusion in which we reject the null hypothesis in error is analogous to a defendant being wrongfully convicted. This error is called a Type I error, and we design the test in such a way as to limit the probability of making such an error so that it does not exceed some predefined value $\alpha$, called the significance level of the test. Conversely, the failure to reject the null when the alternative is true is called a Type II error, and is analogous to the failure to convict a guilty defendant.
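As a small illustration of how $\alpha$ controls the Type I error rate, the following sketch (simulated data, not from the original answer) repeatedly tests a true null and rejects whenever $p < 0.05$; the long-run rejection rate comes out close to 5%.

```python
# Minimal simulation sketch: when the null is true, rejecting whenever p < alpha
# produces Type I errors at roughly the rate alpha. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n_sims, rejections = 0.05, 10_000, 0
for _ in range(n_sims):
    sample = rng.normal(loc=0.0, scale=1.0, size=30)      # null is true: mean = 0
    p = stats.ttest_1samp(sample, popmean=0.0).pvalue
    rejections += p < alpha
print(rejections / n_sims)   # close to 0.05
```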
With all of this in mind, we now turn our attention to your specific question. Here, you are presented with a scenario in which we very clearly want to conclusively demonstrate that the water is safe: in other words, the penalty for erroneously claiming the water is safe when it is not is quite a lot higher than the cost of claiming the water is unsafe when it is. Consider that many people could be severely poisoned or permanently harmed in the former case. So the former error is the one that must be tightly controlled. Consequently, the claim that requires the burden of proof is the claim that the water is safe. This must be your alternative hypothesis. Therefore, if $\mu$ is the mean mercury concentration and $\mu_0$ is the maximum safe level, the correct structure of the hypothesis test is $$H_0 : \mu \ge \mu_0 \quad \mathrm{vs.} \quad H_a : \mu < \mu_0.$$
Under the assumption that the null is true, the test statistic is computed using a null mean of $\mu_0$, since this is the choice that ensures that the Type I error of the test will not exceed the significance level; i.e., it is the most conservative choice and ensures that if we reject $H_0$, we do so in error with probability at most $\alpha$.
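For concreteness, here is a minimal sketch of such a one-sided test; the safety limit $\mu_0 = 2.0$ and the measurements are hypothetical values, not taken from the original question.

```python
# Minimal sketch with hypothetical numbers (mu_0 = 2.0 is an assumed safety
# limit): the burden of proof is on "the water is safe", so H_a: mu < mu_0
# and the test statistic is computed at the boundary value mu_0.
import numpy as np
from scipy import stats

mu_0 = 2.0                                  # assumed maximum safe mercury level
measurements = np.array([1.6, 1.8, 1.7, 1.9, 1.5, 1.8, 1.7, 1.6, 1.9, 1.7])

result = stats.ttest_1samp(measurements, popmean=mu_0, alternative="less")
print(result.pvalue)   # small p-value -> evidence that the water is safe (mu < mu_0)
```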
In NHST, at the end of the day, you will either have statistical evidence for the alternative hypothesis, or nothing. So the alternative hypothesis had better be something you are interested in showing!
If you want to be able to tell people with confidence "Yes, you can drink this water", choose $H_a : \mu < \mu_0$ (where $\mu$ is the mean mercury level and $\mu_0$ is the safety limit). If you want to be able to tell the city with confidence "You have a mercury problem", choose $H_a : \mu > \mu_0$. An appropriate null hypothesis to be tested against is $H_0 : \mu = \mu_0$ in both cases.
The text "You are responsible for determining whether drinking water in a given city is safe" indicates that you want to be able to tell people with confidence "Yes, you can drink this water", so you choose $H_a : \mu < \mu_0$.
The rule for the proper formulation of a hypothesis test is that the alternative or research hypothesis is the statement that, in order to be accepted, must be strongly supported by the evidence furnished by the data.
The null hypothesis is generally the complement of the alternative hypothesis. Frequently, it is (or contains) the assumption that you are making about how the data are distributed in order to calculate the test statistic.
Here are a few examples to help you understand how these are properly chosen.
Suppose I am an epidemiologist in public health, and I'm investigating whether the incidence of smoking among a certain ethnic group is greater than the population as a whole, and therefore there is a need to target anti-smoking campaigns for this sub-population through greater community outreach and education. From previous studies that have been published in the literature, I find that the incidence among the general population is $p_0$. I can then go about collecting sample data (that's actually the hard part!) to test $$H_0 : p = p_0 \quad \mathrm{vs.} \quad H_a : p > p_0.$$ This is a one-sided binomial proportion test. $H_a$ is the statement that, if it were true, would need to be strongly supported by the data we collected. It is the statement that carries the burden of proof. This is because any conclusion we draw from the test is conditional upon assuming that the null is true: either $H_a$ is accepted, or the test is inconclusive and there is insufficient evidence from the data to suggest $H_a$ is true. The choice of $H_0$ reflects the underlying assumption that there is no difference in the smoking rates of the sub-population compared to the whole.
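A minimal sketch of this one-sided binomial proportion test; $p_0$ and the counts below are hypothetical values chosen only for illustration.

```python
# Minimal sketch: one-sided binomial proportion test H_0: p = p_0 vs H_a: p > p_0.
# p_0 and the sample counts are hypothetical.
from scipy.stats import binomtest

p_0 = 0.15                     # assumed smoking incidence in the general population
smokers, sampled = 46, 220     # hypothetical sample from the sub-population

result = binomtest(k=smokers, n=sampled, p=p_0, alternative="greater")
print(result.pvalue)           # small p-value -> evidence that p > p_0
```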
Now suppose I am a researcher investigating a new drug that I believe to be equally effective to an existing standard of treatment, but with fewer side effects and therefore a more desirable safety profile. I would like to demonstrate the equal efficacy by conducting a bioequivalence test. If $\mu_0$ is the mean existing standard treatment effect, then my hypothesis might look like this: $$H_0 : |\mu - \mu_0| \ge \Delta \quad \mathrm{vs.} \quad H_a : |\mu - \mu_0| < \Delta,$$ for some choice of margin $\Delta$ that I consider to be clinically significant. For example, a clinician might say that two treatments are sufficiently bioequivalent if there is less than a $\Delta = 10\%$ difference in treatment effect. Note again that $H_a$ is the statement that carries the burden of proof: the data we collect must strongly support it, in order for us to accept it; otherwise, it could still be true but we don't have the evidence to support the claim.
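This kind of equivalence hypothesis is commonly tested with two one-sided tests (TOST); here is a minimal sketch in which $\mu_0$, $\Delta$, and the data are hypothetical.

```python
# Minimal sketch of the two one-sided tests (TOST) behind this bioequivalence
# setup. mu_0, delta, and the measurements are hypothetical.
import numpy as np
from scipy import stats

mu_0, delta = 10.0, 1.0                    # assumed reference effect and margin
new_drug = np.array([9.8, 10.3, 9.6, 10.1, 9.9, 10.4, 9.7, 10.0, 10.2, 9.8])

# H_a requires both mu - mu_0 > -delta and mu - mu_0 < +delta to be supported.
p_lower = stats.ttest_1samp(new_drug, popmean=mu_0 - delta, alternative="greater").pvalue
p_upper = stats.ttest_1samp(new_drug, popmean=mu_0 + delta, alternative="less").pvalue
p_tost = max(p_lower, p_upper)
print(p_tost)   # p_tost < alpha -> conclude |mu - mu_0| < delta (equivalence)
```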
Now suppose I am doing an analysis for a small business owner who sells three products $A$, $B$, $C$. They suspect that customer preference differs among the three products. Then my hypothesis is $$H_0 : \mu_A = \mu_B = \mu_C \quad \mathrm{vs.} \quad H_a : \exists i \ne j \text{ such that } \mu_i \ne \mu_j.$$ Really, all that $H_a$ is saying is that there are two means that are not equal to each other, which would then suggest that some difference in preference exists.
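A minimal sketch of the corresponding one-way ANOVA, using made-up preference scores for the three products.

```python
# Minimal sketch: one-way ANOVA of H_0: mu_A = mu_B = mu_C against
# "at least two means differ". Scores are hypothetical.
from scipy import stats

scores_A = [7.1, 6.8, 7.4, 7.0, 6.9]
scores_B = [6.2, 6.5, 6.0, 6.4, 6.3]
scores_C = [7.0, 7.2, 6.9, 7.1, 7.3]

f_stat, p_value = stats.f_oneway(scores_A, scores_B, scores_C)
print(f_stat, p_value)   # small p-value -> some pair of means differs
```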
The null hypothesis is nearly always "something didn't happen" or "there is no effect" or "there is no relationship" or something similar. But it need not be this.
In your case, the null would be "there is no relationship between CRM and performance"
The usual method is to test the null at some significance level (most often, 0.05). Whether this is a good method is another matter, but it is what is commonly done.
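For example, here is a minimal sketch (with simulated data, purely illustrative) of testing such a "no relationship" null at the conventional 0.05 level using a correlation test.

```python
# Minimal sketch: test the null "no relationship between CRM and performance"
# at alpha = 0.05 with a Pearson correlation test. Data are simulated, and the
# assumed relationship below is an illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
crm_score = rng.normal(size=40)
performance = 0.5 * crm_score + rng.normal(scale=1.0, size=40)   # assumed relationship

r, p = stats.pearsonr(crm_score, performance)
print(r, p, "reject H0" if p < 0.05 else "fail to reject H0")
```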