A confidence interval is a range of values that, at a stated level of confidence, is expected to contain an unknown population value.
It indicates how precisely an estimate from a single sample approximates the population value.
Suppose we have a vat of 10 million jelly beans, and we want to find the proportion of red ones. We could go through all 10 million jelly beans, but a more efficient method would be to take a sample of jelly beans from random areas of the vat and estimate the proportion.
For instance, in a random sample of 1,000 jelly beans, say we find that 50% are red ones. Because the resulting "50%" is based on only some of the jelly beans, it is subject to some uncertainty or error.
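The confidence interval expresses this uncertainty as a margin of error. In this example, the 95% confidence interval is the sample estimate of 50% plus or minus 3.1 percentage points, that is, 46.9% to 53.1%. If the sampling were repeated many times, about 95% of the intervals built this way would contain the true proportion of red jelly beans.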
Example

| | Formula | Explanation |
|---|---|---|
| 95% c.i. | `= 1.96 * ((p * q)/(n - 1))**0.5` | p = observed proportion; q = 1 - p; n = sample size |
| | `= 1.96 * ((.5 * .5)/(1,000 - 1))**0.5` | 1.96 = z-score for a 95% confidence interval |
| | = 0.031 or 3.1% | |
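
The calculation in the table can be reproduced with a few lines of Python. This is a minimal sketch: the function name `margin_of_error` and its parameter names are chosen here for illustration and do not come from the original text.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Margin of error for a sample proportion, using the formula in the table.

    p: observed proportion, n: sample size, z: z-score (1.96 for 95% confidence).
    """
    q = 1.0 - p
    return z * math.sqrt((p * q) / (n - 1))

# Jelly bean example: 50% red in a random sample of 1,000
moe = margin_of_error(p=0.5, n=1000)
print(f"margin of error: {moe:.3f}")                 # margin of error: 0.031
print(f"interval: {0.5 - moe:.3f} to {0.5 + moe:.3f}")
```

The interval is simply the sample estimate minus and plus the margin of error.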
The size of the margin of error depends on the size of the sample, the observed proportion of red jelly beans, and the total number of jelly beans in the vat. First, the margin of error decreases as the sample size gets larger, though not proportionately, because it varies with the square root of the sample size. For example, if the sample size were doubled to 2,000, the margin of error would shrink to +/- 2.2%, not to half of 3.1%.
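
To see why the relationship is not proportional, the formula can be rearranged to give the sample size needed for a target margin of error. The sketch below assumes the same formula as the example table, solved algebraically for n; the function name `required_sample_size` is hypothetical.

```python
import math

def required_sample_size(target_margin, p=0.5, z=1.96):
    """Smallest n for which z * sqrt(p*q/(n-1)) does not exceed target_margin.

    Obtained by solving the margin-of-error formula for n (an algebraic
    rearrangement, not a formula given in the original text).
    """
    q = 1.0 - p
    return math.ceil(1 + (z ** 2) * p * q / target_margin ** 2)

# Halving the target margin roughly quadruples the sample needed.
for target in (0.03, 0.015):
    print(f"+/- {target:.1%}: n = {required_sample_size(target)}")
```

Under these assumptions, cutting the margin from 3% to 1.5% takes about four times as many jelly beans, not twice as many.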
Second, the margin of error would be largest if the observed proportion of red jelly beans were 0.5 and smallest if it were 0 or 1.0. This is because the results from the sample have the highest variability when the observed proportion is 0.5 and the lowest variability at 0 and 1: the product p * q in the formula peaks at 0.25 when p = 0.5 and falls to 0 when p is 0 or 1. Think of it this way: if there were no red jelly beans in the vat, all of the samples drawn would have the same result, a proportion of 0. Since there would be no variability in the results, there would be no margin of error. But as the observed proportion gets closer to 0.5, there is greater variation in the results and a greater margin of error.
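
A small simulation can illustrate this. The sketch below is an assumption-laden toy: it treats the vat as effectively infinite (each draw is independent), uses a fixed seed so the run is repeatable, and the function name `simulate_sample_proportions` is made up for this example.

```python
import random
import statistics

def simulate_sample_proportions(p_true, n=1000, trials=200, seed=0):
    """Draw `trials` samples of size n and return each sample's proportion of reds.

    Assumes independent draws with a fixed chance p_true of red
    (i.e. the vat is large enough that sampling barely changes it).
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(trials):
        reds = sum(1 for _ in range(n) if rng.random() < p_true)
        proportions.append(reds / n)
    return proportions

# Spread of the sample results for different true proportions of red beans
spreads = {p: statistics.pstdev(simulate_sample_proportions(p)) for p in (0.0, 0.1, 0.5)}
for p, spread in spreads.items():
    print(f"true proportion {p}: spread of sample results = {spread:.4f}")
```

With no red beans every sample returns 0 and the spread is exactly zero; the spread grows as the true proportion approaches 0.5.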
Finally, the margin of error also depends on the population size, although the effect is small when the population is much larger than the sample. The smaller the population, or in this case, the total number of jelly beans, the smaller the sample necessary to attain a given margin of error, though not proportionately. For instance, with a population of 10 million, a sample of 1,000 would result in a confidence interval of +/- 3.1%. If the population were 100,000, a sample of 990 would result in the same confidence interval of +/- 3.1%.
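
The text does not say how population size enters the calculation. One common approach, assumed here, is to multiply the margin of error by a finite population correction of sqrt((N - n)/(N - 1)), where N is the population size. The sketch below uses that assumption; the function name `margin_of_error_fpc` is hypothetical.

```python
import math

def margin_of_error_fpc(p, n, population, z=1.96):
    """Margin of error with a finite population correction.

    Assumption: the correction factor is sqrt((N - n) / (N - 1)),
    a standard choice that the original text does not spell out.
    """
    q = 1.0 - p
    fpc = math.sqrt((population - n) / (population - 1))
    return z * math.sqrt((p * q) / (n - 1)) * fpc

# The two cases from the text
for population, n in ((10_000_000, 1000), (100_000, 990)):
    moe = margin_of_error_fpc(0.5, n, population)
    print(f"N = {population:>10,}, n = {n}: +/- {moe:.1%}")
```

Under this assumption both cases round to +/- 3.1%, matching the figures quoted above.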