Summary
As non-response rates continue to rise year after year, concern about the representativeness of surveys is growing. Andrew Mercer, Senior Research Methodologist at the Pew Research Center, studied the topic and shared what he learned, along with best practices for improving representativeness:
- No sampling method or data collection process guarantees representativeness.
- It’s more helpful to think in terms of models. A model is just the set of assumptions you’re making about how your sample relates to the larger population. It can be a complex statistical procedure, but it may also take the form of weighting or quotas.
- Doing nothing also relies on a model, but it’s a model that assumes there are no differences between the sample and the population.
- The key question is whether or not modeling assumptions are justified.
- Justified assumptions rest on three requirements: you have identified and measured the variables needed to predict either the probability that someone in the population would take the survey or the outcome variable you are trying to study; you know how those variables are distributed in the population; and your sample is not missing unique segments of the population. These requirements apply to random sampling, weighting, quotas, sample matching, Bayesian hierarchical models, and every other way to generalize from sample to population.
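To make the idea of "a model" concrete, here is a minimal sketch of one of the simplest models mentioned above, weighting to known population shares, next to the "do nothing" estimate. It is an illustration, not a procedure from the talk: the function names, the age groups, the population shares, and the sample data are all hypothetical.

```python
from collections import Counter

def unweighted_mean(sample):
    """The 'do nothing' model: assumes the sample already mirrors the population."""
    return sum(outcome for _, outcome in sample) / len(sample)

def poststratified_mean(sample, population_shares):
    """Weight each respondent by (population share / sample share) of their group.

    sample: list of (group, outcome) pairs.
    population_shares: dict mapping group -> share of the population (assumed known).
    """
    counts = Counter(group for group, _ in sample)
    # A group that exists in the population but not in the sample cannot be
    # fixed by weighting: there is nobody to give the weight to.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"sample has no respondents in: {sorted(missing)}")
    n = len(sample)
    weights = {g: population_shares[g] / (counts[g] / n) for g in counts}
    total_weight = sum(weights[g] for g, _ in sample)
    return sum(weights[g] * outcome for g, outcome in sample) / total_weight

# Hypothetical sample: young people are 50% of the population but only 20% of respondents.
sample = [("young", 1)] * 2 + [("old", 1)] * 2 + [("old", 0)] * 6
shares = {"young": 0.5, "old": 0.5}

print(unweighted_mean(sample))              # estimate under the "no differences" model
print(poststratified_mean(sample, shares))  # estimate under the weighting model
```

The two estimates differ only because they encode different assumptions. The weighted one is better only if age group really is the variable that drives participation or the outcome, and if the population shares are right.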
Key takeaways:
- Think through your modeling assumptions before collecting data, particularly by identifying confounding variables.
- Complex statistical procedures will only fix your problems if you have the necessary variables and knowledge of the population and research subject.
- Methods that are broadly applicable are valuable, but no method is universal.
- Polling errors are inevitable. Use them to improve future research.
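
The takeaway about needing the necessary variables can be illustrated with a small sketch. In this made-up example, participation and the outcome both depend on a respondent's interest in the topic, not on age. Weighting on age leaves the estimate unchanged, while weighting on interest recovers the value assumed for the population. The function name, variables, shares, and records are all hypothetical.

```python
from collections import Counter

def weighted_mean(records, variable, population_shares):
    """Weight records to population shares on a single variable; return the mean outcome."""
    n = len(records)
    counts = Counter(r[variable] for r in records)
    weights = {level: population_shares[level] / (counts[level] / n) for level in counts}
    total_weight = sum(weights[r[variable]] for r in records)
    return sum(weights[r[variable]] * r["outcome"] for r in records) / total_weight

# Hypothetical sample of 10: interested people are over-represented (80% vs. an
# assumed 50% in the population), while age is already balanced.
records = (
    [{"age": "young", "interested": True, "outcome": 1}] * 4
    + [{"age": "old", "interested": True, "outcome": 1}] * 4
    + [{"age": "young", "interested": False, "outcome": 0}]
    + [{"age": "old", "interested": False, "outcome": 0}]
)
age_shares = {"young": 0.5, "old": 0.5}          # assumed population shares
interest_shares = {True: 0.5, False: 0.5}        # assumed population shares

by_age = weighted_mean(records, "age", age_shares)
by_interest = weighted_mean(records, "interested", interest_shares)
print(by_age, by_interest)
```

Under these assumptions the population value is 0.5. The age-weighted estimate stays at the raw sample mean because age predicts neither participation nor the outcome here, so the procedure has nothing to correct.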