We ran a marketing campaign at work with a control group and a test group. (The test group got a solicitation offering a discount on our product; the control group did not.) The results looked something like this:
| Group   | Converted | Did not convert |
| ------- | --------- | --------------- |
| Control | 30        | 387             |
| Test    | 59        | 465             |
So, in this example (not our real data), the test group converted at an 11.3% rate while the control group converted at a 7.2% rate.
My boss wants to make a statement like:

> The difference is 4.1 percentage points. With 95% confidence, I think the true difference lies between X% and Y%.
His strategy is to:

1. Calculate the standard deviation of each proportion:
```r
ctrl <- 30/(30 + 387)
test <- 59/(59 + 465)
sqrt(ctrl*(1-ctrl)/(30 + 387))  # 0.01265354
sqrt(test*(1-test)/(59 + 465))  # 0.01380879
```
2. Calculate the confidence interval of each proportion:
```r
c(ctrl - 2*0.01265354, ctrl + 2*0.01265354)  # +/- 2 standard devs gives the 95% conf interval
c(test - 2*0.01380879, test + 2*0.01380879)
```
3. Then do something like take the difference of the lower bounds and the difference of the upper bounds.

That last step is where I'm confused and skeptical of his approach.
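If I've understood that last step correctly, it would work out to something like this (my own reading of his instructions, not code he wrote):

```r
# My reading of his combine step: subtract bound from bound
lower_diff <- (test - 2*0.01380879) - (ctrl - 2*0.01265354)
upper_diff <- (test + 2*0.01380879) - (ctrl + 2*0.01265354)
c(lower_diff, upper_diff)  # roughly 0.038 to 0.043 -- suspiciously narrow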
I know how to calculate a 95% confidence interval for the difference using a simulation, but I suspect there's a formula or R function that'll let me plug and chug.
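For reference, the simulation I have in mind looks roughly like this (a parametric-bootstrap-style sketch; the variable names are just mine):

```r
# Simulate conversion counts from each group's observed rate, then take
# quantiles of the simulated differences (parametric bootstrap sketch)
set.seed(1)
n_sims <- 100000
n_ctrl <- 30 + 387
n_test <- 59 + 465
sim_ctrl <- rbinom(n_sims, n_ctrl, 30/n_ctrl) / n_ctrl
sim_test <- rbinom(n_sims, n_test, 59/n_test) / n_test
quantile(sim_test - sim_ctrl, c(0.025, 0.975))  # simulated 95% interval
```

(I've also wondered whether `prop.test` is the plug-and-chug function I'm after, but I'm not sure its interval is the same thing my boss has in mind.)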
As a side note, I originally wanted to use Fisher's exact test for this project, but its confidence interval is reported in terms of an odds ratio, and my boss is adamant about getting a confidence interval for the difference in conversion rates.
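For what it's worth, here's the Fisher test call I mean; its confidence interval refers to the odds ratio, not the difference in rates:

```r
# Fisher's exact test on the 2x2 table; fisher.test's conf.int
# is for the odds ratio, which is why it doesn't satisfy my boss
conversions <- matrix(c(30, 387, 59, 465), nrow = 2, byrow = TRUE,
                      dimnames = list(Group = c("Control", "Test"),
                                      Outcome = c("Converted", "Did not convert")))
fisher.test(conversions)
```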