
This is Part II in a series about sample size. Part I introduced the concepts discussed below. To read Part I, please click here.
Part I of this series laid the foundations for understanding sample size as a function of margins of error, population variability, and confidence levels. Researchers can enter these three parameters into an online sample size calculator, and it will output a recommended minimum sample size. But how does the calculator work? What happens if we take a peek “under the hood” of the sample size calculator?
The primary goal of this post is to apply our understanding of the dependencies between the margin of error, population variability, and confidence levels to the calculation of sample size, in nontechnical, conceptual terms. Three principles guide this calculation.
Principle #1. Researchers need large samples to study diverse populations while keeping margins of error narrow
All else being equal, population variability and margins of error increase in proportion to each other (see Dependency #1 in Part I). In practice, though, this relationship is moderated by sample size. Large samples statistically diffuse, or wash out, the correspondence between population variability and margin of error: as samples grow, the margin of error tightens around the sample’s average, no matter how diverse the population.
You can think of this as individual outliers losing their ability to skew the sample’s average: their scores get woven into the fabric of an increasingly large number of observations, and they are canceled out by equally extreme outliers at the opposite end of the spectrum.
The roulette table and casino odds provide a useful analogy. It is no secret that casinos operate with a “house advantage.” But variability in roulette is high, and if you observe a small number of spins, you might see outlying cases of people winning big. Observe for a day, and many hundreds of spins, and you will inevitably find that the margin of error around the “true value” of the casino odds shrinks to a minimum and the casino ultimately emerges victorious, regardless of the influence of outlying or extreme cases earlier in the day. However, it takes a large number of observations to diffuse the effects of variability on the margin of error.
So, if a narrow margin of error is desirable for research, but the population you are studying contains high variability, a large sample size will be a requirement for the research budget.
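To make this concrete, here is a minimal Python sketch of the standard margin-of-error formula for a sample mean, ME = z × σ / √n. The inputs are illustrative: σ = 0.5 units for the population standard deviation and z ≈ 1.28 for an 80% confidence level, matching the survey example later in this post.

```python
import math

# Margin of error for a sample mean: ME = z * sigma / sqrt(n).
# Illustrative inputs: sigma = 0.5 units (population standard
# deviation) and an 80% confidence level (z ~ 1.28).
Z, SIGMA = 1.28, 0.5

for n in (25, 100, 400, 1600):
    me = Z * SIGMA / math.sqrt(n)
    print(f"n = {n:4d}: margin of error = {me:.3f} units")
```

Each quadrupling of the sample cuts the margin of error in half; a more variable population starts the curve higher but follows the same downward path.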
Principle #2. To increase confidence levels without increasing margins of error, the sample size must increase
Population variability is not something we can control. According to Principle #1 above, when a population is highly variable, only a large sample will keep margins of error narrow and in check.
At the same time, confidence levels and margins of error increase in proportion to each other (see Dependency #2 in Part I). So if high confidence levels are desirable, margins of error will naturally widen, offsetting any effort to mitigate population variability through increased sample size.
Can you see where I’m going with this?
The prescription for keeping margins of error tight while claiming high confidence is to continue increasing sample size, above and beyond what would be needed if we accounted for population variability alone. High confidence levels amplify the need for large samples.
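A small sketch makes the squeeze visible. Holding the sample size fixed and raising the confidence level widens the margin of error. The z-values below are the usual two-decimal critical values, and the sample size of 164 anticipates the example in the next section:

```python
import math

# At a fixed sample size, a higher confidence level means a larger
# critical value z, which widens the margin of error:
# ME = z * sigma / sqrt(n).
# Illustrative inputs: sigma = 0.5 units, n = 164.
SIGMA, N = 0.5, 164

for confidence, z in [("80%", 1.28), ("90%", 1.65), ("95%", 1.96), ("99%", 2.58)]:
    me = z * SIGMA / math.sqrt(N)
    print(f"{confidence} confidence: margin of error = {me:.3f} units")
```

At 99% confidence, the margin of error at n = 164 doubles to about 0.10 units; to pull it back down to 0.05, the sample would have to grow to 666, as the table below shows.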
Principle #3. The cost of tightening margins of error or raising confidence levels grows much faster than linearly
This third and final principle gets down to the brass tacks of the sample size calculator. In plain terms: small changes to the desired margin of error or confidence level produce outsized changes in the minimum sample size. An example illustrates this best.
Let’s say you want to run a survey. You know from prior research that population variability is equal to a standard deviation of 0.5 units. You set the margin of error to 5% and the confidence level to 80%. How many participants should you expect to recruit?
According to this calculator, you need a sample size (n) of 164. Great! This should be within budget. But as a researcher, you would be more comfortable with confidence levels set higher than 80% and/or margins of error set lower than 5%. How would the minimum sample size change in response?
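Under the hood, the textbook formula for this kind of estimate is n = (z × σ / E)², where z is the critical value for the chosen confidence level (roughly 1.28 for 80%), σ is the standard deviation, and E is the margin of error. Plugging in our numbers: n = (1.28 × 0.5 / 0.05)² = 12.8² ≈ 164 participants, after rounding up. (The exact method behind any particular calculator may vary slightly, but this is the standard calculation for estimating a mean.)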
We can view minimum sample sizes in the table below (starting with the far-right cell of the top row, n = 164). The table displays minimum sample sizes as a function of margin of error (columns) and confidence level (rows), holding the standard deviation constant at 0.5 units.
| Confidence Level | 1% | 2% | 3% | 4% | 5% |
| --- | --- | --- | --- | --- | --- |
| 80% | 4,096 | 1,024 | 456 | 256 | 164 |
| 85% | 5,184 | 1,296 | 576 | 324 | 208 |
| 90% | 6,807 | 1,702 | 757 | 426 | 273 |
| 95% | 9,604 | 2,401 | 1,068 | 601 | 385 |
| 99% | 16,641 | 4,161 | 1,850 | 1,041 | 666 |
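If you want to peek under the hood yourself, the following Python sketch reproduces the table from the same formula introduced above. It assumes the calculator used the familiar two-decimal critical values; a calculator with more precise z-values, or different rounding, may shift a cell by a participant or two.

```python
import math

# Minimum sample size for estimating a mean: n = ceil((z * sigma / E)**2)
# sigma: population standard deviation, E: margin of error,
# z: two-sided critical value for the chosen confidence level.
SIGMA = 0.5
MARGINS = [0.01, 0.02, 0.03, 0.04, 0.05]
Z_BY_CONFIDENCE = {"80%": 1.28, "85%": 1.44, "90%": 1.65, "95%": 1.96, "99%": 2.58}

for confidence, z in Z_BY_CONFIDENCE.items():
    # round(..., 6) guards against floating-point fuzz before rounding up
    row = [math.ceil(round((z * SIGMA / e) ** 2, 6)) for e in MARGINS]
    print(confidence, row)
```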
Starting at n = 164, scan vertically down the 5% column. This shows you the cost of increasing the confidence level while keeping the margin of error at 5%. To jump from 80% confidence (n = 164) to 85% confidence (n = 208), the minimum sample size increases by 44 participants. To jump from 85% to 90% (n = 273), it increases by 65. The next jump, to 95% (n = 385), adds another 112. And so on.
If changes in sample size were proportional to changes in confidence level, each of these jumps would be roughly the same size. But that’s not what we see: each successive increase in confidence level demands a bigger jump in minimum sample size than the one before it.
We see the same pattern scanning across the rows of the table. At a fixed confidence level, the minimum sample size grows quadratically, not linearly, as the margin of error shrinks: halving the margin of error roughly quadruples the required sample. At 80% confidence, for example, moving from a 2% margin of error to 1% takes n from 1,024 to 4,096.
To find the balance between scientific integrity and resource conservation, the costs of raising confidence levels or tightening the margin of error must be weighed. Both can be optimized with a large enough sample, but the numbers quickly escalate into the thousands. At the least (80% confidence, 5% margin of error), you will need 164 participants. At the most (99% confidence, 1% margin of error), 16,641!
Understanding the mechanics of sample size improves your skills as a communicator
The principles described here reflect the pragmatic, if also unfortunate, reality that when conducting survey research, samples of the highest scientific integrity need to be exorbitantly large: large enough to be out of reach for all but the most exceptional research budgets. Compromises must be made for the research to be carried out, and knowledge of what’s “under the hood” of any sample size calculator gives you an edge when making tough but informed decisions.
You may find, as I did, that understanding the mechanics of sample size builds your talents as a communicator at the negotiating table. Sample size is often the most expensive line item in the budget, and decisions to reduce sample size come with tradeoffs. Communicating these tradeoffs to stakeholders is an important step in balancing the reality of the research budget with the need for adequate resources to conduct the research you envision.
To learn more about how Key Lime Interactive can improve customer experiences at every stage—from discovery to development to validation, contact us.