What's the difference between a research hypothesis and a statistical hypothesis?
A rule of thumb from a good advisor of mine was to set the null hypothesis to the outcome you do not want to be true, i.e. the outcome whose direct opposite you want to show.
Basic example: suppose you have developed a new medical treatment and you want to show that it is indeed better than placebo. So you set the null hypothesis to "the new treatment is equal to or worse than placebo" and the alternative hypothesis to "the new treatment is better than placebo".
This is because in the course of a statistical test you either reject the null hypothesis (and favor the alternative hypothesis) or you fail to reject it. Since your "goal" is to reject the null hypothesis, you set it to the outcome you do not want to be true.
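To make this concrete, here is a minimal sketch in Python (not from the original answer; the data, effect size, and the assumption that higher scores mean a better outcome are all invented) of a one-sided two-sample t-test with the null set to "the new treatment is equal to or worse than placebo":

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
placebo = rng.normal(loc=50, scale=10, size=40)    # hypothetical control scores
treatment = rng.normal(loc=56, scale=10, size=40)  # hypothetical treatment scores

# H0: new treatment is equal to or worse than placebo (mean difference <= 0)
# H1: new treatment is better than placebo (mean difference > 0)
t_stat, p_value = stats.ttest_ind(treatment, placebo, alternative="greater")

alpha = 0.05
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject H0, evidence the treatment is better")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject H0")
```

Note that the two possible outcomes are exactly the ones described above: reject H0 or fail to reject it; the test never "accepts" H0.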
Side note: I am aware that one should not set up a statistical test and then twist and break it until the null hypothesis is rejected; the casual language was only used to make this rule easier to remember.
These may also be helpful: What is the meaning of p values and t values in statistical tests? and/or What is a good introduction to statistical hypothesis testing for computer scientists?
If hypothesis B is the interesting hypothesis, you can take not-B as the null hypothesis and control, under the null, the probability of the type I error of wrongly rejecting not-B at level $\alpha$. Rejecting not-B is then interpreted as evidence in favor of B: because we control the type I error, it is unlikely that not-B is true. Confused ... ?
Take the example of treatment vs. no treatment in two groups from a population. The interesting hypothesis is that treatment has an effect, that is, there is a difference between the treated group and the untreated group due to the treatment. The null hypothesis is that there is no difference, and we control the probability of wrongly rejecting this hypothesis. Thus we control the probability of wrongly concluding that there is a treatment effect when there is no treatment effect. The type II error is the probability of wrongly accepting the null when there is a treatment effect.
The formulation above is based on the Neyman-Pearson framework for statistical testing, where testing is seen as a decision problem between two cases, the null and the alternative. The level $\alpha$ is the fraction of times we make a type I error if we (independently) repeat the test. In this framework there is really no formal distinction between the null and the alternative: if we interchange them, we interchange the probabilities of type I and type II errors. We did not, however, control the type II error probability above (it depends upon how big the treatment effect is), and because of this asymmetry we may prefer to say that we fail to reject the null hypothesis (rather than that we accept it). Thus we should be careful about concluding that the null hypothesis is true just because we can't reject it.
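A small simulation can illustrate this frequency interpretation. The sketch below (the group sizes, effect size, and level are made-up assumptions, not from the answer) repeats the test many times: when the null is true, the rejection rate stays close to $\alpha$, while the type II error rate is not controlled and depends on how big the effect is.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n, n_sims = 0.05, 30, 5_000

type_i = 0   # reject H0 when there is truly no effect
type_ii = 0  # fail to reject H0 when there truly is an effect
for _ in range(n_sims):
    control = rng.normal(0.0, 1, n)
    null_group = rng.normal(0.0, 1, n)    # null true: no difference
    effect_group = rng.normal(0.5, 1, n)  # alternative true: real effect (assumed size 0.5)

    if stats.ttest_ind(null_group, control).pvalue < alpha:
        type_i += 1
    if stats.ttest_ind(effect_group, control).pvalue >= alpha:
        type_ii += 1

print(f"type I error rate  ~ {type_i / n_sims:.3f} (controlled near alpha = {alpha})")
print(f"type II error rate ~ {type_ii / n_sims:.3f} (not controlled; depends on effect size)")
```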
In a Fisherian significance testing framework there is really only a null hypothesis, and one computes, under the null, a $p$-value for the observed data. Smaller $p$-values are interpreted as stronger evidence against the null. Here the null hypothesis is definitely not-B (no effect of treatment) and the $p$-value is interpreted as the amount of evidence against the null. With a small $p$-value we can confidently reject the null, that there is no treatment effect, and conclude that there is a treatment effect. In this framework we can only reject or not reject (never accept) the null, and it is all about falsifying the null. Note that the $p$-value does not need to be justified by an (imaginary) repeated number of decisions.
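For illustration, here is a hedged sketch (the data and group sizes are invented) of computing such a $p$-value with a permutation test on the treatment example: under the null of no treatment effect the group labels carry no information, so the $p$-value is the fraction of random relabelings that produce a difference at least as large as the one observed.

```python
import numpy as np

rng = np.random.default_rng(2)
treated = rng.normal(0.6, 1, 25)    # hypothetical outcomes with a real effect
untreated = rng.normal(0.0, 1, 25)  # hypothetical control outcomes

observed = treated.mean() - untreated.mean()
pooled = np.concatenate([treated, untreated])

n_perm = 10_000
count = 0
for _ in range(n_perm):
    rng.shuffle(pooled)  # relabel the groups at random, as the null allows
    diff = pooled[:25].mean() - pooled[25:].mean()
    if diff >= observed:  # one-sided: evidence of a positive effect
        count += 1

p_value = (count + 1) / (n_perm + 1)  # add-one correction avoids p = 0
print(f"permutation p-value = {p_value:.4f} (smaller = stronger evidence against the null)")
```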
Neither framework is without problems, and the terminology is often mixed up. I can recommend the book Statistical Evidence: A Likelihood Paradigm by Richard M. Royall for a clear treatment of the different concepts.