
This is Part I in a series about sample size. Part II extends the concepts discussed below. To read Part II, please click here.
For research professionals in behavioral fields like UXR, consumer behavior, and social psychology, sample size is often the most expensive line item in the budget. Participants exchange their time and energy for your research needs, and the price tag for this exchange adds up quickly. Survey research in particular requires many data points to draw definitive conclusions, so it does not pay to cut corners with the research budget when planning a survey study. But how do we know how many participants are enough?
Peeking under the hood
Many of the sample size calculators available online are handy (here is one, and here is another). You feed the calculator some inputs, it spits out a minimum sample size, and – voila! – you’ve done your due diligence. The research budget is now justified to stakeholders.
Except, further diligence may be due if you’re asked to explain how the sample size calculator arrived at its answer. Like driving a car without understanding how an engine works, it’s easy to calculate sample size without a full grasp of the online calculator’s mathematical guts. Let’s change that.
If we take a peek “underneath the hood” of a sample size calculator, you may find, as I did, that understanding the mechanics of sample size builds your strengths as a communicator. Better communication with stakeholders bolsters the integrity of the research and justifies your seat at the table.
This post will review three important inputs to sample size: margins of error, confidence levels, and population variability. Each is like the “lever” of the sample size calculator. These levers are interconnected, resulting in two dependencies, discussed below, that are key to unlocking the sample size algorithm.
Margins of error
No sample perfectly estimates the population average. The margin of error is a metric of precision, defining how narrowly the sample approximates the population’s “true value.” The smaller the margin of error, the more precise the sample’s estimates. For example, a study reporting a margin of error of +/- 2% has narrower boundaries, and therefore more precise estimates, than a study reporting a margin of error of +/- 5%.
A survey reporting a narrow margin of error is like a powerful magnifying glass, zooming in on an increasingly precise sample estimate. Added precision provides clarity for decision-makers in determining an appropriate course of action from your results.
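To make this concrete, here is a minimal Python sketch of the standard margin-of-error formula for a survey proportion (the function name and example figures are my own, chosen for illustration):

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Margin of error for a sample proportion p_hat from n respondents.
    z = 1.96 corresponds to a 95% confidence level."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# A 60% agreement rate measured on 400 vs. 1,600 respondents:
print(f"n=400:  +/- {margin_of_error(0.60, 400):.1%}")   # ~ +/- 4.8%
print(f"n=1600: +/- {margin_of_error(0.60, 1600):.1%}")  # ~ +/- 2.4%
```

Notice that quadrupling the sample only halves the margin of error, which is exactly why precision gets expensive quickly.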
Population variability
Population variability refers to a population’s heterogeneity or diversity. The more diverse the population, the more likely there are extreme or outlying cases not well represented by the population’s average. In other words, the more heterogeneous the population, the more widely responses within a sample will spread around the average.
Populations with high variability require larger samples than populations with low variability. Gardens provide a nice analogy. A garden that grows one type of flower is like a population with low variability. You can pick a few bunches and each bunch will resemble the one before it. It will take only a few bunches to summarize the garden’s average color. A garden with high variability grows many kinds of flowers, and you will probably have to pick many bunches before you can determine the garden’s average color.
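A quick simulation makes the garden intuition concrete. The sketch below (the population parameters are invented for illustration) draws repeated samples from a low-variability and a high-variability population and compares how much their sample averages bounce around:

```python
import random
import statistics

random.seed(42)

def sample_mean_spread(sigma, n=50, trials=1000, mu=100):
    """Draw `trials` samples of size n from a normal population with
    mean mu and std. dev. sigma, then report how widely the sample
    averages scatter around the truth."""
    means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
             for _ in range(trials)]
    return statistics.stdev(means)

print("low-variability garden :", round(sample_mean_spread(sigma=5), 2))
print("high-variability garden:", round(sample_mean_spread(sigma=25), 2))
# The high-variability estimates scatter about five times as widely,
# so matching the low-variability garden's precision takes a much
# larger sample.
```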
Dependency #1: High variability means wide margins of error
Population variability and margin of error are naturally linked. All else being equal, as population variability increases, so does the margin of error. Quite literally, there is more diversity to account for. When population variability is low, the margin of error shrinks, because samples from these populations will contain fewer deviations from the average.
Visualize this relationship in the garden analogy as the number of flowers within each bunch that are of average color. In a garden with low variability, few flowers have extreme or unique colors, so the margin of error is low. In a garden with high variability, each bunch will contain flowers of many colors, some average and some far from average, so the margin of error is high.
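In formula terms, the margin of error for an average grows in direct proportion to the population’s standard deviation. A minimal sketch, with illustrative numbers:

```python
import math

def moe_for_mean(s, n, z=1.96):
    """Margin of error for a sample mean: z * s / sqrt(n), at 95% confidence."""
    return z * s / math.sqrt(n)

n = 200
for s in (5, 15, 30):  # increasing population standard deviation
    print(f"std dev {s:>2}: margin of error +/- {moe_for_mean(s, n):.2f}")
# Holding n and confidence fixed, tripling the variability triples
# the margin of error.
```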
Confidence levels
Confidence levels account for the fact that sampling is inherently probabilistic: There is always a risk of seeing something that’s not there or of missing something that is. Confidence levels therefore answer the question: If we ran this study many times, what proportion of those runs would capture the population’s true value rather than return a “false positive”?
Confidence levels are set in advance, based on how much risk we are willing to take. Setting the confidence level to 90% means accepting that studies identical to ours would capture the population’s true value only nine times out of 10.
In the garden analogy, this would mean that one bunch out of 10 will contain zero flowers of average color. If that happened to be the bunch you picked, you would end up making false inferences about the garden. If you cannot tolerate being wrong 10% of the time, you might set the confidence level to 95% instead, expecting the average color to appear in 19 bunches out of 20. Setting confidence levels higher still reduces the risk of false positives even further.
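You can watch this long-run guarantee play out in simulation. The sketch below (again with invented population parameters) builds 10,000 confidence intervals at the 90% level and counts how often they bracket the true average:

```python
import random
import statistics

random.seed(7)

def ci_covers_truth(mu=100, sigma=15, n=50, z=1.645):
    """Build one 90% confidence interval (z = 1.645) from a fresh sample
    and check whether it brackets the true population mean."""
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    half_width = z * sigma / n ** 0.5
    mean = statistics.fmean(sample)
    return mean - half_width <= mu <= mean + half_width

trials = 10_000
hits = sum(ci_covers_truth() for _ in range(trials))
print(f"{hits / trials:.1%} of studies captured the true average")
# Expect a result near 90%: roughly one study in ten misses the truth.
```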
Dependency #2: High confidence means wide margins of error
Confidence levels and margins of error are directly related. All else being equal, the tradeoff for having higher confidence is accepting greater uncertainty about the precise “location” of the population’s true average value.
This can be counterintuitive. High confidence levels are good, and low margins of error are good, so why should setting high confidence levels lead to something bad?
As a thought experiment, imagine having 100% confidence in a sample. The margins of error would have to encompass every response you collected. We wouldn’t know “where” the true average is, only that it lies somewhere in the sample. By this logic, a few extreme outliers would be just as plausible a candidate for the average as the many responses clustered near the center. Since outliers are by definition rare, an interval that broad tells us almost nothing.
So we must give up some confidence, reducing it to, say, 99%, in exchange for the ability to exclude the edge cases. By narrowing the margin of error, we become 99% certain we’ve captured the true population average.
This process could continue. If it did, the range of values within which we claim the true average lies would keep shrinking, but we would make that claim with less and less confidence than if the margins of error were wider.
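The tradeoff is easy to quantify, because the interval’s width scales with the z critical value attached to each confidence level:

```python
from statistics import NormalDist

# The interval's half-width scales with the z critical value,
# so raising the confidence level widens the margin of error.
baseline = NormalDist().inv_cdf((1 + 0.80) / 2)  # z at 80% confidence
for conf in (0.80, 0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf((1 + conf) / 2)
    print(f"{conf:.0%} confidence -> z = {z:.2f}, "
          f"interval {z / baseline:.2f}x as wide as at 80%")
```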
“Wait, I thought this post was about sample size!”
Margins of error, population variability, and confidence levels are interconnected. It’s not just the “levers” themselves that affect sample size; it’s the interconnections between them. A foundational understanding of these dependencies is central to understanding the dynamics of sample size estimation.
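As a preview of where Part II goes, the textbook formula for estimating an average ties all three levers together (the numbers below are purely illustrative):

```python
import math

def min_sample_size(z, s, e):
    """Textbook minimum n for estimating a mean: n = (z * s / e)^2.
    z encodes the confidence level, s the population's variability
    (standard deviation), and e the desired margin of error."""
    return math.ceil((z * s / e) ** 2)

# 95% confidence (z = 1.96), std. dev. 15, margin of error +/- 2:
print(min_sample_size(z=1.96, s=15, e=2))  # 217 respondents
```

Widen the margin of error, lower the confidence level, or survey a less variable population, and the required sample shrinks; tighten any lever and it grows.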
With this foundation in hand, please see Part II of this blog, which will discuss how sample size materializes from these inputs and their dependencies.