Trouble Understanding Primary/Secondary Hypothesis for Senior Research Methods Course.
Psychology Stats! Hypothesis and Inferential Statistics help!
Wow that's a lot of questions! I recommend finding a good intro to statistics book because you need to reference something when you are in doubt.
Here's the gist of things, in simple terms:
Hypothesis testing is based on the idea that when you analyze your sample data you want to know whether what you find is due to random chance or an actual effect. For example, you give a group of people a meditation course and you see that after the course their anxiety levels dropped by 5 points on average on some scale you used. Is this 5 point decrease random variation or does the meditation course have anything to do with it?
With hypothesis testing, which is just one way to get an answer to your research question, you create a null sampling distribution. A sampling distribution is the distribution of the means of a virtually infinite number of samples drawn from the population. For example, if you want to measure anxiety, you take one sample and their average anxiety is 21, another and it's 17, another and it's 24, and so on. Imagine you do it infinitely many times and you end up with a nice curve, where the center is the population value, the true anxiety score of all people (which we will never know), and the further away from the center you go, the less likely those means are to come up in a sample. What does this mean? Imagine you draw a sample of 30 people and suppose that the true population anxiety score is 22 out of 60. It is very likely that the mean of your sample will be somewhere around 22. You could, however, end up with a sample whose mean is 3 or 58. It is very unlikely that you randomly select 30 people who all have really high or really low anxiety. It can definitely happen just by chance but it's not going to be common.
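If it helps to see this rather than imagine it, here is a rough simulation sketch in Python. The population mean of 22 and the sample size of 30 come from the example above; the population standard deviation of 8, the normal shape of the population, and the 10,000 repetitions (standing in for "infinitely many") are assumptions made up purely for illustration.

```python
import math
import random
import statistics

# Assumed population: anxiety scores roughly normal with mean 22.
# The SD of 8 is invented for this illustration.
POP_MEAN = 22
POP_SD = 8
SAMPLE_SIZE = 30
NUM_SAMPLES = 10_000  # stand-in for "infinitely many" samples

rng = random.Random(0)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Draw n people from the assumed population and return their mean score."""
    return statistics.fmean(rng.gauss(POP_MEAN, POP_SD) for _ in range(n))


# Build the (approximate) sampling distribution of the mean.
sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

center = statistics.fmean(sample_means)   # should sit close to 22
spread = statistics.stdev(sample_means)   # this spread is the standard error
theoretical_se = POP_SD / math.sqrt(SAMPLE_SIZE)

print(f"center of sampling distribution: {center:.2f}")
print(f"spread (simulated standard error): {spread:.2f}")
print(f"theoretical standard error: {theoretical_se:.2f}")
print(f"smallest and largest sample mean: {min(sample_means):.1f}, {max(sample_means):.1f}")
```

The sample means should pile up around 22, and even the most extreme ones should land nowhere near 3 or 58, which is the "possible but very unlikely" point made above.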
Now let's go back to your research. The anxiety of your group before meditation was 28 and after meditation it is 23. Was this random chance? You assume that it is; that is your null hypothesis. So you say, my mean of 23 comes from the same sampling distribution as my mean of 28. You use 28 as your sampling distribution mean and use some formulas to calculate the standard error of this distribution. And now the key question: with this distribution, how surprised am I to see a sample mean of 23? How surprised you are depends on how far 23 is from the sampling distribution mean. If it's one standard error below, you're not very surprised; after all, some variation is the norm. If it's 3 standard errors below, well, now you are very surprised, because while it is possible that your sample comes from this sampling distribution, it is very unlikely.
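To make "how many standard errors away" concrete, here is a small sketch. The means of 28 and 23 are from the example; the standard deviation of 10 and the sample size of 30 are assumptions I'm plugging in just so there are numbers to work with, and the function names are made up.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: the sample SD divided by the square root of n."""
    return sd / math.sqrt(n)


def standard_errors_from_null(observed_mean, null_mean, sd, n):
    """How many standard errors the observed mean sits from the null mean."""
    return (observed_mean - null_mean) / standard_error(sd, n)


NULL_MEAN = 28      # mean before meditation, used as the null distribution's center
OBSERVED_MEAN = 23  # mean after meditation
ASSUMED_SD = 10     # assumption: not given in the example
ASSUMED_N = 30      # assumption: not given in the example

se = standard_error(ASSUMED_SD, ASSUMED_N)
distance = standard_errors_from_null(OBSERVED_MEAN, NULL_MEAN, ASSUMED_SD, ASSUMED_N)

print(f"standard error: {se:.2f}")
print(f"23 is {abs(distance):.2f} standard errors below 28")
```

With these made-up numbers, 23 ends up between 2 and 3 standard errors below 28, so you would be fairly surprised. A bigger SD or a smaller sample would make the standard error larger and the same 5 point drop less surprising.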
How surprised do you need to be to conclude that maybe your meditation course has something to do with the decrease in anxiety? Enter the p-value and its cutoff, which is no magic number. The p-value is the probability, under your null distribution, of getting a sample mean at least as far from 28 as 23 is. Historically researchers agreed to set the cutoff (the significance level, or alpha) to .05, which simply means "if the probability of having a sample mean of 23, or something even more extreme, is less than 5% under my null distribution, then I'm going to assume that it doesn't come from there and it actually comes from an alternative distribution".
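Here is a hedged sketch of that decision rule. It uses a normal curve for the null distribution instead of a t distribution, only because Python's standard library has a normal distribution built in; with a sample this small a t distribution would give a somewhat larger p-value. The SD of 10 and sample size of 30 are assumptions, not numbers from the example.

```python
import math
from statistics import NormalDist

NULL_MEAN = 28      # center of the null sampling distribution
OBSERVED_MEAN = 23  # the sample mean you actually got
ASSUMED_SD = 10     # assumption: not given in the example
ASSUMED_N = 30      # assumption: not given in the example
ALPHA = 0.05        # the conventional cutoff

standard_error = ASSUMED_SD / math.sqrt(ASSUMED_N)
z = (OBSERVED_MEAN - NULL_MEAN) / standard_error

# Two-sided p-value: probability of a mean at least this far from 28,
# in either direction, if the null distribution is the right one.
p_value = 2 * NormalDist().cdf(-abs(z))

reject_null = p_value < ALPHA

print(f"z = {z:.2f}, p = {p_value:.4f}")
print("reject the null" if reject_null else "fail to reject the null")
```

With these assumed numbers the p-value comes out well under .05, so you would reject the null. Notice that the code only ever compares p to alpha; it never says the null is false.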
A very important caveat to all of this: you can never conclude that the null hypothesis is true or false, nor that the alternative is true or false. The only thing you can conclude is that you reject the null hypothesis (or not) based on the data you found. Often researchers make a jump from here and conclude that the intervention had an effect, but this is based on logic, not statistics.
A t distribution is one type of distribution; the name basically has to do with the shape of the curve. There are many distributions and they all have different curves. Cohen's d is a measure of how big an effect is: it expresses the difference between two means in standard deviation units. For example, you conclude that the meditation course reduces anxiety, but by how much? Cohen's d gives you some idea of whether there is a big or small change.
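A minimal sketch of Cohen's d, assuming the common pooled standard deviation version for two groups. The scores below are invented so that the means come out to 28 and 23 as in the example; they are not real data, and the function name is made up. For a before/after design on the same people there are other variants of d, so check which one your course expects.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 in the denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical anxiety scores, invented for illustration only.
before = [30, 26, 28, 31, 25]  # mean 28
after = [24, 22, 23, 25, 21]   # mean 23

d = cohens_d(after, before)
print(f"Cohen's d = {d:.2f}")
```

The sign just tells you the direction (negative here because anxiety went down). The usual rough guide treats about 0.2 as small, 0.5 as medium, and 0.8 as large, so these invented scores show a very large effect, mostly because I made the scores so tightly clustered.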