Your questions, our answers
Response rates as quality criteria for survey data
In this section of the newsletter we answer questions on data collection and data processing procedures that are regularly asked by people who perform survey-related tasks and may work with survey data but are not necessarily trained survey researchers. We hope that your questions and our answers will allow for a better understanding of the scope of the survey and of how to use the data in an appropriate manner.
The response rate of a survey is generally considered to be a major, or even the major, quality criterion for survey data. There are two reasons to be concerned about non-response. First, the loss of respondents reduces the sample size, which in turn reduces the precision of the survey estimates. Second, non-response may introduce error into the results.
In general, good survey practice includes minimising non-response rates. Most survey practitioners have informal expectations about what constitutes an acceptable response rate, what counts as high, and what is perhaps an unacceptable level. One of the many survey methodology guides states, for example, that a response rate of 50% is adequate for analysis and reporting, a response rate of 60% is good, and a response rate of 70% is very good.
The truth is, however, that there is no basis for fixing a level or range of response rates above which we can say that results are free of non-response error and below which they are compromised by it.
The truth about non-response bias
The fundamental issue with respect to non-response is not just the amount of non-response but also its selectivity, that is, the difference between respondents and non-respondents. The relationship can be written as:
Respondent Mean = Full Sample Mean + (Non-Response Rate) x (Respondent Mean - Non-Respondent Mean)
- Respondent mean: estimate based on the responses of the respondents
- Non-Respondent mean: estimate based on the responses that non-respondents would have given had they participated
- Full sample mean: the 'real' mean of interest based on the responses of respondents and non-respondents
Or in other words:
- Non-response bias = (Non-Response Rate) x (Respondent Mean - Non-Respondent Mean)
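The identity above can be derived in a few lines. The symbols are introduced here only for illustration: n is the full sample size, r the number of respondents, m the number of non-respondents (so n = r + m), and \bar{y}_n, \bar{y}_r and \bar{y}_m are the full sample, respondent and non-respondent means. The full sample mean is the weighted average of the two group means:

```latex
\begin{align*}
\bar{y}_n &= \frac{r}{n}\,\bar{y}_r + \frac{m}{n}\,\bar{y}_m \\
          &= \left(1 - \frac{m}{n}\right)\bar{y}_r + \frac{m}{n}\,\bar{y}_m \\
          &= \bar{y}_r - \frac{m}{n}\left(\bar{y}_r - \bar{y}_m\right) \\
\Rightarrow\quad \bar{y}_r &= \bar{y}_n + \frac{m}{n}\left(\bar{y}_r - \bar{y}_m\right)
\end{align*}
```

The second term on the right-hand side is the non-response bias of the respondent mean.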
This formula shows that the non-response bias is a function of two components: the non-response rate (m/n, where m is the number of non-respondents and n is the full sample size) and the difference between the respondent and the non-respondent means. This means that:
1. Non-response bias is not an inherent property of a survey; it is defined only relative to a particular estimate. Many studies of non-response bias find that the bias varies widely across the estimates within a single survey.
2. The biasing influence of non-response is eliminated under two conditions: either when the non-response rate is zero (there are no non-respondents) or when there are no differences between respondents and non-respondents on the statistic of interest. In any situation in between, some bias is possible, whether small or large. High response rates constrain the likely size of non-response bias, but a high response rate does not guarantee that there will be no bias.
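The two points above can be illustrated with a small sketch. The function names and all numbers below are hypothetical and chosen only for illustration; the means can be read as the proportion of sample members with some characteristic.

```python
def nonresponse_bias(nonresponse_rate, respondent_mean, nonrespondent_mean):
    """Bias of the respondent mean relative to the full sample mean."""
    return nonresponse_rate * (respondent_mean - nonrespondent_mean)


def full_sample_mean(nonresponse_rate, respondent_mean, nonrespondent_mean):
    """Weighted average of the respondent and non-respondent means."""
    response_rate = 1 - nonresponse_rate
    return response_rate * respondent_mean + nonresponse_rate * nonrespondent_mean


# Hypothetical scenarios: (non-response rate, respondent mean, non-respondent mean)
scenarios = {
    "80% response, selective non-response": (0.20, 0.60, 0.30),
    "50% response, no selectivity": (0.50, 0.60, 0.60),
    "50% response, mild selectivity": (0.50, 0.60, 0.55),
}

for label, (rate, y_r, y_m) in scenarios.items():
    bias = nonresponse_bias(rate, y_r, y_m)
    y_n = full_sample_mean(rate, y_r, y_m)
    print(f"{label}: full sample mean = {y_n:.3f}, bias = {bias:+.3f}")
```

In these made-up scenarios, the survey with an 80% response rate carries a larger bias (about 0.06) than either survey with a 50% response rate (0 and about 0.025), because its non-respondents differ more strongly from its respondents. In practice the non-respondent mean is unknown, so a calculation like this serves as a sensitivity check under assumed values rather than as a measurement of the bias.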
In conclusion, there is no single answer, valid for all surveys, to the questions of at what level non-response begins to affect data quality, which data are and are not affected, and what the nature and significance of any non-response bias is. These questions need to be assessed for each survey separately and, indeed, for each estimate of interest within a survey. In the next issue of this newsletter we will look at how a survey practitioner can evaluate the quality of a survey in terms of the efforts made to measure and reduce non-response error.

