Impact of pesticide exposure misclassification on estimates of relative risks in the Agricultural Health Study
- Aaron Blair1,2,
- Kent Thomas3,
- Joseph Coble4,
- Dale P Sandler5,
- Cynthia J Hines6,
- Charles F Lynch7,
- Charles Knott8,
- Mark P Purdue1,
- Shelia Hoar Zahm1,
- Michael C R Alavanja1,
- Mustafa Dosemeci1,
- Freya Kamel5,
- Jane A Hoppin5,
- Laura Beane Freeman1,
- Jay H Lubin1
- 1Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland, USA
- 2National Cancer Institute, Bethesda, Maryland, USA
- 3National Exposure Research Laboratory, US Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- 4Annapolis, Maryland, USA
- 5Epidemiology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
- 6Division of Surveillance, Hazard Evaluation, and Field Studies, National Institute for Occupational Safety and Health, Cincinnati, Ohio, USA
- 7Department of Epidemiology, University of Iowa, Iowa City, Iowa, USA
- 8Centers for Public Health Research and Evaluation, Battelle, Inc., Research Triangle Park, North Carolina, USA
- Correspondence to Aaron Blair, National Cancer Institute, Executive Plaza South, Room 8008, Bethesda, MD 20892, USA
- Accepted 16 December 2010
- Published Online First 21 January 2011
Background The Agricultural Health Study (AHS) is a prospective study of licensed pesticide applicators and their spouses in Iowa and North Carolina. We evaluate the impact of occupational pesticide exposure misclassification on relative risks using data from the cohort and the AHS Pesticide Exposure Study (AHS/PES).
Methods We assessed the impact of exposure misclassification on relative risks using the range of correlation coefficients observed between measured post-application urinary levels of 2,4-dichlorophenoxyacetic acid (2,4-D) and a chlorpyrifos metabolite and exposure estimates based on an algorithm from 83 AHS pesticide applications.
Results Correlations between urinary levels of 2,4-D and a chlorpyrifos metabolite and algorithm estimated intensity scores were about 0.4 for 2,4-D (n=64), 0.8 for liquid chlorpyrifos (n=4) and 0.6 for granular chlorpyrifos (n=12). Correlations of urinary levels with kilograms of active ingredient used, duration of application, or number of acres treated were lower and ranged from −0.36 to 0.19. These findings indicate that a priori expert-derived algorithm scores were more closely related to measured urinary levels than individual exposure determinants evaluated here. Estimates of potential bias in relative risks based on the correlations from the AHS/PES indicate that non-differential misclassification of exposure using the algorithm would bias estimates towards the null, but less than that from individual exposure determinants.
Conclusions Although correlations between algorithm scores and urinary levels were quite good (ie, correlations between 0.4 and 0.8), exposure misclassification would still bias relative risk estimates in the AHS towards the null and diminish study power.
What this paper adds
The theoretical effects of exposure misclassification are well understood, but information on the actual impact on relative risks is limited.
Correlations between pesticide urinary levels in the AHS and individual exposure determinants are lower than those with exposure estimates based on an expert-derived intensity algorithm.
Pesticide exposure misclassification in the AHS would bias estimates of relative risks toward the null.
This information is critical for interpretation of AHS findings and provides an indication of the likely effects of pesticide misclassification in other investigations.
Exposure misclassification can limit the validity and precision of epidemiological studies and diminish power to detect associations. The theory and mechanics of misclassification are well described1–3 and the impact of exposure misclassification on relative risk estimates can be large.4 5 In the Agricultural Health Study (AHS), as in many epidemiological studies, there is no ‘gold standard’ for exposure. In these cases, it is useful to relate estimates of exposure to actual measurements of current exposures (even if only at a single point in time) to provide an indication of the degree of exposure misclassification associated with surrogate indicators for exposures. Information from such methodological efforts is of considerable assistance in the interpretation of epidemiological data.
The AHS is a long-term, prospective cohort study of licensed pesticide applicators and their spouses in Iowa and North Carolina.6 The purpose of this paper is to use information from the AHS Pesticide Exposure Study (AHS/PES),7 which compares urinary levels of pesticides with exposure estimates based on an expert-derived algorithm8 and with several individual exposure determinants (kilograms of active ingredient used, hours of mixing and application, and number of acres treated) to evaluate the effects of exposure misclassification on estimates of relative risks in the AHS.
Information on pesticide use and application procedures in the AHS was obtained by self-administered questionnaires (available at http://www.aghealth.org/questionnaires.html). Questionnaire information obtained at enrolment on pesticide use included pesticides used, application methods, mixing and applying, proportion of time personally mixed pesticides, first year of use, number of years and days per year personally applied, application method and use of protective equipment. Information obtained on specific pesticides included ever used, mixing and application method, years used, average days per year of use and first year of use. Monitoring information from the literature and from the Pesticide Handlers Exposure Database was used to develop weights for important a priori exposure determinants identified from the literature, including mixing, application method, repair of application equipment and use of personal protective equipment.8 These weights were applied to information on pesticide use practices from AHS questionnaires to create quantitative pesticide exposure intensity scores. These scores were multiplied by the lifetime days of specific pesticide use to create intensity-weighted exposure metrics that have been used in a number of epidemiological papers on various outcomes from this cohort (the AHS bibliography is available at http://www.aghealth.org/).
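The final step described above, multiplying an intensity score by lifetime days of use, can be illustrated with a minimal sketch. It takes the algorithm-derived intensity score as a given input, because the way the individual weights are combined is documented in the cited algorithm paper and not here. The class and function names are hypothetical, and lifetime days are assumed to be the product of years of use and average days per year, both of which were collected by questionnaire.

```python
from dataclasses import dataclass


@dataclass
class PesticideUse:
    """Questionnaire-style summary for one pesticide (hypothetical structure)."""
    intensity_score: float  # algorithm-derived exposure intensity, taken as given
    years_used: float       # number of years the pesticide was personally applied
    days_per_year: float    # average days per year of use


def lifetime_days(use: PesticideUse) -> float:
    """Lifetime days of use, assumed here to be years multiplied by days per year."""
    return use.years_used * use.days_per_year


def intensity_weighted_days(use: PesticideUse) -> float:
    """Intensity-weighted exposure metric: intensity score times lifetime days."""
    return use.intensity_score * lifetime_days(use)


# Illustrative values only, not data from the cohort.
example = PesticideUse(intensity_score=10.3, years_used=20, days_per_year=5)
print(lifetime_days(example), intensity_weighted_days(example))
```

Keeping the intensity score separate from the days of use makes it possible to examine duration and intensity on their own as well as in combination.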
Details of the AHS/PES monitoring effort and algorithm assessment study are provided elsewhere.7 9 Briefly, the AHS/PES participants were individuals who had completed the AHS 5-year follow-up interview between 1998 and 2003, had reported use of 2,4-dichlorophenoxyacetic acid (2,4-D) or chlorpyrifos, resided in selected counties in Iowa and North Carolina, and indicated they intended to use a product containing 2,4-D or chlorpyrifos during the upcoming season. Urine spot samples and 24 h accumulations were collected prior to, during and after an application of the target pesticides and analysed for levels of 2,4-D and 3,5,6-trichloro-2-pyridinol (TCP; a metabolite of chlorpyrifos). These pesticides were selected for the assessment study because they are important agricultural chemicals worldwide, are used by many AHS participants with several different application methods, and may impact human health.10 11 The AHS/PES participants provided information on application practices at the time of application and, in addition, the AHS/PES monitoring team recorded application practices. Both sources of information and individual exposure determinants were used to create exposure intensity scores using the previously developed algorithm,8 and each score was compared to post-application urinary levels of 2,4-D and the chlorpyrifos metabolite (TCP) using Spearman correlation coefficients. Spearman rank order correlation values were calculated because the urinary biomarker measurements were not normally distributed and because a linear relationship between biomarker measurement and exposure intensity scores could not be assumed. In addition, the algorithm scores are not fully continuous because the algorithm variable weighting factors are combined in certain discrete combinations. 
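The choice of a rank-based statistic can be illustrated with a short sketch that computes a Spearman coefficient as a Pearson correlation on average ranks, so that tied algorithm scores share a rank. This is a generic textbook implementation and not the software used in the AHS/PES analysis; the function names and the example values are hypothetical.

```python
from math import log, sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative values only: discrete scores with a tie, and skewed urinary levels.
scores = [1.8, 4.5, 4.5, 9.0, 12.0, 20.0]
urine_ug_per_l = [1.6, 30.0, 12.0, 25.0, 970.0, 400.0]
print(round(spearman(scores, urine_ug_per_l), 2))
```

Because only the ordering matters, log-transforming the skewed urinary values leaves the coefficient unchanged, which is why no distributional or linearity assumption is needed.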
The pesticide exposure section of the AHS/PES questionnaire mimicked that from the 5-year follow-up questionnaire administered to the full cohort and included questions on determinants used in the algorithm.8 Urinary concentrations have also been compared with several individual determinants (K Thomas, personal communication, 2009).
We assessed the impact of exposure misclassification on relative risks using the range of correlation coefficients (0.20, 0.40 and 0.70) observed between measured urinary levels of 2,4-D and the chlorpyrifos metabolite and either the algorithm scores or the individual exposure determinants. We considered nine scenarios based on proportions of applicators in the AHS reporting use of various pesticides (ie, 20%, 40% and 70%), a range of sensitivities that are possible with correlation coefficients of 0.20, 0.40 and 0.70, and on the range of relative risks that have been observed in the AHS and often seen in epidemiological investigations (0.5, 1.0, 2.0 and 3.0). The calculations for relative risk attenuation based on these parameters are described in the supplementary online appendix. This study was approved by the National Institutes of Health Special Studies Institutional Review Board, protocol number OH93-NC-N013, and also by Institutional Review Boards at the University of Iowa, Westat, Inc., RTI International and Battelle, Inc. Informed consent was obtained from all participants prior to enrolment.
Urinary biomarker measurement results have been previously reported for 2,4-D and chlorpyrifos applicators in the AHS/PES.7 9 Geometric mean (geometric standard deviation) values in post-application urine samples were 25 (4.1) μg/l for 2,4-D applicators and 11 (2.3) μg/l TCP for chlorpyrifos. There was considerable range among the post-application measurements (>600-fold for 2,4-D applicators (1.6–970 μg/l) and greater than 30-fold for chlorpyrifos applicators (2.5–80 μg/l)). Post-application geometric mean TCP levels for chlorpyrifos applicators were over seven times higher than geometric mean levels in the U.S. adult general population in the 2001–2002 period.12 Geometric mean values for 2,4-D in the U.S. general population are not available due to the preponderance of non-detect values, but post-application geometric mean 2,4-D levels for 2,4-D applicators were about 20 times greater than the 95th percentile level in the U.S. adult general population.12 Exposure intensity algorithm scores based on questionnaires were 10.3±4.6 (range 1.8–20) for 2,4-D applicators and 9.4±2.6 (range 6.6–14) for chlorpyrifos applicators.9
Spearman correlations between post-application urinary levels of 2,4-D and chlorpyrifos metabolites and estimated exposure intensity scores based on monitoring team observations of AHS/PES participant activities were 0.39 for 2,4-D, 0.80 for liquid chlorpyrifos and 0.60 for granular chlorpyrifos (table 1).9 Results were similar using exposure intensity scores based on information from participant-completed questionnaires with correlations of 0.42 for 2,4-D, 0.80 for liquid chlorpyrifos and 0.58 for granular chlorpyrifos. Table 2 provides Spearman correlations between urinary levels of 2,4-D or chlorpyrifos metabolite among study participants and individual determinants of pesticide exposure used in some epidemiological studies, for example, kilograms of active ingredient, hours spent mixing and applying, and number of acres treated (K Thomas, personal communication, 2009). These correlation coefficients were quite low and none was statistically significant. The correlations for 2,4-D were all less than 0.1 in absolute value and those for chlorpyrifos were 0.19 for kilograms of active ingredient, −0.28 for hours of use per day and −0.36 for acres treated.
Figure 1 shows the impact of exposure misclassification on relative risks considering the correlation between urinary levels and exposure estimates noted above and relative risks in a range relevant to the published results from the AHS. Correlations between estimated exposure intensity scores and urinary levels of 0.2 or less (dotted lines) and sensitivities of 0.9 or less would depress the relative risks considerably. Some lines do not provide information across the full range of possible sensitivities because they are undefined for certain combinations of prevalence of use, sensitivity, specificity and correlation. Many relative risks are so close to the null value that a reasonable interpretation would be that no association exists. For correlations of 0.4 (dashed lines), observed relative risks for the different sensitivity and exposure misclassification categories are somewhat closer to the true relative risks than for correlations of 0.2, but they still show substantial attenuation towards the null for sensitivities of 0.9 or less. Only for correlations of 0.7 (solid lines) do the observed relative risks approach the true relative risks. For true relative risks of 1.0, the misclassification described here does not bias the relative risk regardless of the proportion exposed or the magnitude of the exposure misclassification; that is, the estimated relative risk is always 1.0 and non-differential misclassification cannot create a positive association.
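The kind of calculation behind these curves can be sketched as follows. This is not the method of the supplementary appendix, which is not reproduced here. It is a minimal illustration under stated assumptions: exposure is treated as binary, misclassification is non-differential, and the correlation is taken to be the phi coefficient between true and classified exposure. All function names are hypothetical. For a given prevalence, sensitivity and correlation, the sketch solves for the implied specificity by bisection and returns `None` when no specificity can reach the target correlation, which is one way some combinations can be undefined.

```python
from math import sqrt


def classified_prevalence(prevalence, sensitivity, specificity):
    """Proportion classified as exposed: true positives plus false positives."""
    return sensitivity * prevalence + (1 - specificity) * (1 - prevalence)


def phi_correlation(prevalence, sensitivity, specificity):
    """Phi coefficient between true and classified binary exposure."""
    q = classified_prevalence(prevalence, sensitivity, specificity)
    youden = sensitivity + specificity - 1
    return youden * sqrt(prevalence * (1 - prevalence) / (q * (1 - q)))


def observed_relative_risk(true_rr, prevalence, sensitivity, specificity):
    """Relative risk seen when groups are formed from the classified exposure."""
    q = classified_prevalence(prevalence, sensitivity, specificity)
    # Fraction truly exposed among those classified exposed / unexposed.
    exposed_in_pos = sensitivity * prevalence / q
    exposed_in_neg = (1 - sensitivity) * prevalence / (1 - q)
    return (1 + exposed_in_pos * (true_rr - 1)) / (1 + exposed_in_neg * (true_rr - 1))


def specificity_for_correlation(target, prevalence, sensitivity, steps=60):
    """Specificity giving the target phi, or None if it cannot be reached."""
    if phi_correlation(prevalence, sensitivity, 1.0) < target:
        return None  # even perfect specificity falls short: undefined combination
    low, high = 1 - sensitivity, 1.0  # phi is 0 at low and increases with specificity
    for _ in range(steps):
        mid = (low + high) / 2
        if phi_correlation(prevalence, sensitivity, mid) < target:
            low = mid
        else:
            high = mid
    return high


# Illustrative scenario: 40% exposed, correlation 0.4, true relative risk 3.0.
for sens in (0.6, 0.7, 0.8):
    spec = specificity_for_correlation(0.4, 0.4, sens)
    if spec is not None:
        print(sens, round(spec, 3), round(observed_relative_risk(3.0, 0.4, sens, spec), 2))
```

Under this model a true relative risk of 1.0 always returns an observed value of 1.0, because the numerator and denominator both reduce to 1.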
Studies have evaluated the reliability and validity of farmers' self-reports of their pesticide application activities.13–15 The reliability of farmers' recall of the types of pesticides used is between 60% and 80% for most pesticides.13 Farmers can also provide considerable detail regarding their application practices, although as the questions get more detailed the reliability decreases.13 Reliable reporting of the fact of pesticide use and application technique does not, however, provide assurance that exposure metrics and, more importantly, dose can be accurately estimated from such questionnaire data. Dose, that is, the concentration at the target tissue, is the ultimate metric of interest in epidemiological studies, but is largely unmeasurable.16 Exposure and biological factors both influence dose. Only one metabolite of chlorpyrifos (TCP) was monitored in the urine in this study and the concentration of other metabolites might also be important for health outcomes, although TCP is the major chlorpyrifos metabolite in humans. Chemical-specific biological factors at the individual level, such as permeability of the skin and other tissues of first contact and metabolism, are important but largely unavailable for epidemiological studies. Some information on exposure factors, such as type and condition of the equipment, use of protective equipment, type of clothing and application rate, can be obtained by interview, but with reporting error. Estimates of pesticide exposure in the AHS were developed from an algorithm that included determinants that appeared, based on the literature, to affect exposure.8 A concern about exposure estimates based on an algorithm is that the error associated with each determinant might multiply to something quite large and unreliable. If this were true, use of a simple, single exposure determinant might be preferable to a more complicated algorithm.
Thus, an indication of the magnitude of misclassification from exposure estimates based on an algorithm derived from several determinants versus estimates based on a single determinant, for example, acres treated, hours spent mixing and applying, or amount of active ingredient used, is essential for sound interpretation of data from epidemiological studies and to provide guidance regarding exposure estimation efforts in future studies.17
Data from the recent AHS/PES methodological study found moderate to high correlations (r=0.39 to r=0.80) between measured levels in the urine and algorithm-derived estimates of pesticide exposure intensity based on information from self-reports by study participants or from observations by AHS/PES investigators during the monitoring of pesticide mixing and application activity.9 These correlations between urinary levels and algorithm scores are similar to those reported for 2,4-D, glyphosate and 2-methyl-4-chlorophenoxyacetic acid (MCPA) elsewhere.18–20 It is important to keep in mind that comparison of observational data and monitoring data collected at the time of application does not provide direct information on farmers' ability to recall past use of pesticides, which is critical for examining relationships between chronic diseases and pesticide exposure. Whatever the correlation between urine measurements and a farmer's reporting of specific pesticide activities at the time of monitoring, it is likely that correlation with application activities in the past is weaker because of increased uncertainty that occurs with the passage of time. Inclusion of frequency or duration of use of pesticides in cumulative exposure indices could introduce further misclassification that would typically lead to under-estimates of risk, as has been shown elsewhere.21 On the other hand, it is also possible that recall of the details of pesticide use over many growing seasons might provide a better estimate of cumulative exposure over a long time period than a biological measurement of exposure from a single application, particularly because urinary levels from non-persistent pesticide exposure reflect only recent use and are not necessarily a measure of long-term use.
Several conclusions can be drawn from evaluation of the impact of exposure misclassification on estimated relative risks in the AHS. First, the correlations between questionnaire or observer information on pesticide use and measured urinary levels are in the range found for other factors that are usually considered to be reliably obtained for epidemiological studies, such as tobacco and alcohol use, diet, physical activity and health assessments.22–27 Second, exposure estimates from an algorithm based on several determinants thought to affect exposure are more highly correlated with measured levels of these pesticides in the urine than some specific individual determinants (ie, kilograms of active ingredient used, hours of mixing and application, or number of acres treated) and would result in less attenuation of relative risks. In fact, in this example the correlations between these individual determinant measures and urinary levels of 2,4-D are so low (<0.1) that even if the true relative risk were 3.0, the calculated relative risk would only be about 1.1, making it very unlikely that any epidemiological study could detect an association. The correlations between these individual determinants and urinary levels of chlorpyrifos are somewhat larger (−0.36 to 0.19) than for 2,4-D (−0.09 to 0.09), but they are still considerably less than found for exposure intensity estimates based on the algorithm.8 Third, the stronger correlations between urinary levels and algorithm exposure scores (eg, 0.4 or 0.5) would still result in considerable attenuation of observed relative risks. For example, if the correlation between algorithm exposure intensity scores and measured urinary levels were 0.4 and the true relative risk were 3.0, the observed relative risks would be between 1.3 and 1.9 when sensitivity is in the 60–80% range. For a true relative risk of 2.0, the observed relative risks from correlations of 0.2 or 0.4 never rise above 1.4.
For true relative risks of 0.5, correlations from 0.2 to 0.4 between exposure estimates and measurements yield estimates of relative risk between 0.7 and 0.9. All of these observed relative risks are in a range where a reasonable interpretation would be that no important association exists. In the AHS/PES exposure studies, only evaluation of chlorpyrifos in the liquid formulation had a correlation of 0.7 or greater and this may be inaccurate because the sample size was very small. The attenuation of relative risks from exposure misclassification would also reduce study power, which would necessitate larger investigations to meet study objectives.
There are additional considerations in assessing the accuracy of estimates of exposure intensities used in epidemiological studies. First, for many chronic diseases, it is generally assumed that the critical exposure window occurs many years in the past. The correlations between estimates of exposure intensity and urinary levels in the AHS/PES7 9 are based on simultaneous collection of information on exposure determinants by questionnaire or observation and measurement of urinary levels of pesticides. Estimates of exposure intensity based on self-reported activities that occurred years in the past would probably be subject to greater error. Second, the correlations between algorithm scores and urinary levels varied by pesticide in each of the three recent methodological studies9 18–20 and the range was quite large, that is, from r=0.12 to r=0.80. Third, the impact of misclassification on estimates of relative risks is influenced by the proportion of individuals exposed because this affects the sensitivity and specificity levels. For the range of exposure misclassification noted here, it appears that the proportion of the population exposed was less important than the accuracy of the exposure assessment. This conclusion, however, is based on relatively thin data and a more complete evaluation of this issue is needed.
Some cautions about these findings are warranted. The AHS/PES monitoring study provides information on farmer owner/operators and may not be relevant for other pesticide applicators. The number of measurements on chlorpyrifos is quite small and estimates are relatively unstable. The differences we observed between the correlations of urinary levels with individual determinants and those with algorithm scores need further evaluation to see if they are generalisable to other situations. However, these data provide useful evidence regarding the reliability of the exposure metrics used in the AHS and for the interpretation of AHS findings.
We draw several conclusions from our methodological work in the AHS. First, the accuracy of reporting of pesticide use by farmers is comparable to that for many other factors commonly assessed by questionnaire for epidemiological studies.22–27 Second, except in situations where exposure estimation is quite accurate (ie, correlations of 0.70 or greater with true exposure) and true relative risks are 3.0 or more, pesticide misclassification may diminish risk estimates to such an extent that no association is obvious, which indicates that false negative findings might be common. Third, it appears that an algorithm that incorporates several exposure determinants into an estimate of exposure intensity predicts urinary levels better than the individual exposure determinants considered here and would result in less attenuation of relative risk estimates. This provides some confirmation of the assumption that use of algorithms will improve exposure assessment. Finally, we note that even with the reduction in power from exposure misclassification, the AHS has identified some statistically significant links between various agricultural exposures and health outcomes.28–34
We thank the participants of the AHS for their contribution to this research.
Mention of trade names or commercial products does not constitute endorsement or recommendation for use. This article has been subjected to Agency administrative review and approved for publication.
The findings and conclusions in this report are those of the author(s) and do not necessarily represent the views of the National Cancer Institute, National Institute of Environmental Health Sciences, U.S. Environmental Protection Agency or National Institute for Occupational Safety and Health.
Funding This research was partially supported by the Intramural Research Program of the NIH (Division of Cancer Epidemiology and Genetics, National Cancer Institute (Z01CP010119) and the National Institute of Environmental Health Sciences (Z01-ES049030-1)). This work has been funded in part by the U.S. Environmental Protection Agency under Contracts 68-D99-011 and 68-D99-012, and through Interagency Agreement DW-75-93912801-0.
Competing interests None.
Ethics approval This study was approved by the National Institutes of Health Special Studies Institutional Review Board, protocol number OH93-NC-N013, and also by Institutional Review Boards at the University of Iowa, Westat, Inc., RTI International and Battelle, Inc.
Provenance and peer review Not commissioned; externally peer reviewed.