Abstract
Background: It has been speculated on theoretical grounds that biomarkers are superior to air samples as surrogates for chemical exposures in epidemiology studies.
Methods and Results: Biomarkers were classified according to their position in the exposure-disease continuum, that is, parent compound, reactive intermediate, stable metabolite, macromolecular adduct, or measure of cellular damage. Because airborne exposures and these different biomarkers are time series that vary within and between persons in a population, they are all prone to measurement error effects when used as surrogates for true chemical exposures. It was shown that the attenuation bias in the estimated slope characterising a log exposure-log disease relation should decrease as the within- to between-person variance ratio of a given set of air or biomarker measurements decreases. To gauge the magnitudes of these variance ratios, a database of 12 077 repeated observations was constructed from 127 datasets, including air and biological measurements from either occupational or environmental settings. The within- and between-person variance components (in log scale, after controlling for fixed effects of time) and the corresponding variance ratios for each set of air and biomarker measurements were estimated. It was shown that estimated variance ratios of biomarkers decreased in the order short term (residence time ⩽2 days) > intermediate term (2 days < residence time ⩽2 months) > long term biomarkers (residence time >2 months). Overall, biomarkers had smaller variance ratios than air measurements, particularly in environmental settings. This suggests that a typical biomarker would provide a less biasing surrogate for exposure than would a typical air measurement.
Conclusion: Epidemiologists are encouraged to consider the magnitudes of variance ratios, along with other factors related to practicality and cost, in choosing among candidate surrogate measures of exposure.
 environmental monitoring
 biomarkers
 air measurements
 variance components
 epidemiology
 attenuation
A major goal of occupational and environmental epidemiology is to establish quantitative relationships between exposures to toxic chemicals and the associated risks of disease. Most studies have considered airborne exposures, where inhalation was the primary route of entry of the contaminant into the body. Investigators collected air samples to estimate concentrations inhaled by members of occupational groups or by the general population. Steady technological advances over the last 50 years have made it possible to collect large numbers of air measurements and thereby to reduce uncertainties in quantifying levels of exposure. Although the anticipated gains in sample size have not materialised,^{1,}^{2} the technology currently exists to conduct longitudinal studies of health effects from chemical exposures.
Biological monitoring has been increasingly viewed as a desirable alternative to air sampling for characterising occupational and environmental exposures. (Here we use the term “environmental exposures” to refer to chemical exposures in indoor and outdoor settings not associated with workplaces.) This technique utilises biological specimens, especially breath, urine, and blood, to quantify levels of contaminants or their products in the body.^{3,}^{4} Biological monitoring is theoretically desirable because it accounts for all possible exposure routes (for example, inhalation, ingestion, and dermal contact), it covers unexpected or accidental exposures, and it reflects interindividual differences in uptake or genetic susceptibility.^{5–}^{8}
The endpoint of biological monitoring is often referred to as a biomarker, defined by the US National Research Council (NRC) as “… a change induced by a contaminant in the biochemical or cellular components of a process, structure or function that can be measured in a biological system”.^{9} The NRC divided biomarkers into three categories, namely, biomarkers of exposure, of effect, and of susceptibility. Examples of biomarkers of exposure include volatile organic compounds in breath, heavy metals in blood or urine, urinary metabolites of organic compounds, and adducts of genotoxic chemicals with haemoglobin or albumin. Biomarkers of effect represent early preclinical changes thought to be related to health risk or damage. Examples include DNA adducts of genotoxic chemicals, mutations in specific genes such as hypoxanthine-guanine phosphoribosyltransferase (HPRT), changes in serum proteins indicative of altered metabolism or function, and cytogenetic changes in peripheral lymphocytes, including chromosome aberrations and sister chromatid exchanges (SCEs). Finally, biomarkers of susceptibility relate to an individual’s inherited or acquired ability to respond to a hazardous substance. Single nucleotide polymorphisms (SNPs) of important phase I enzymes (generally bioactivating enzymes such as cytochrome P450) and phase II enzymes (generally deactivating enzymes, such as glutathione-S-transferases or epoxide hydrolases) are often regarded as biomarkers of susceptibility.^{10}
The relationship between exposure to a toxic substance and the many possible biomarkers is shown in fig 1, using a genotoxic carcinogen to illustrate the functional elements.^{11–}^{13} Processes leading to chronic diseases other than cancer can be described by similar schemes. The input to the model is {X_{ij}}, representing the time series of n discrete exposures to the carcinogen (j = 1, 2, …, n) (each averaged over time unit, Δt), received by the i^{th} person in a population. The subsequent time series {P_{ij}}, {R_{ij}}, {M_{ij}}, {(RY)_{ij}} and {D_{ij}} represent the corresponding time series of biomarkers (to be defined).
The chemical must first be absorbed into the body via inhalation at rate k_{0i} (designated as the uptake rate for the i^{th} person). In some instances, the substance is intrinsically electrophilic and capable of reacting with DNA and proteins. However, most cancer causing chemicals must first be metabolically activated to electrophiles. Thus, in fig 1, {P_{ij}} refers to the levels of the parent compound, while {R_{ij}} represents levels of the electrophile. The relative amounts of P_{ij} and R_{ij}, at any time, depend on competing rates of passive elimination of the parent compound (designated k_{1i}, including excretion in breath and urine), of metabolic bioactivation of P_{ij} to R_{ij} (designated k_{2i}), and of detoxification of R_{ij} (designated k_{3i}) giving rise to a stable metabolite M_{ij}, excreted at rate k_{8i}. A fraction of R_{ij} reacts at rate k_{4i} with DNA to produce a DNA adduct (RY)_{ij}, where Y represents a DNA base. In fig 1, molecular damage is represented as the series of adduct levels {(RY)_{ij}}, at different times. Most cells contain repair systems that remove DNA adducts and thereby protect the tissue from long term damage. Thus, the amount of (RY)_{ij} depends on the relative rates of DNA adduction (that is, k_{4i}) and repair (designated k_{5i}). Cellular damage is represented by the series representing damaged cells {D_{ij}}, which depends on the rates of cell damage (given by k_{6i}), and repair and/or cell turnover (at rate k_{7i}). The magnitude of an individual’s risk of cancer ultimately relates to the integration of D_{ij} over time, relative to some period of latency, and to his or her susceptibility as determined by genetic, physiological, metabolic, and lifestyle factors. Note that the rate constants k_{0i} – k_{8i} are assumed to be constant for the i^{th} individual but to vary across the population.
In the context of the NRC’s definitions of biomarkers, {P_{ij}}, {R_{ij}} and {M_{ij}} would be biomarkers of exposure, while {(RY)_{ij}} and {D_{ij}} would be biomarkers of effect. Note that biomarkers of susceptibility would be measures of the variability of some rate constants (particularly k_{2i} – k_{7i}) across the population (analogous to effect modifiers).
Progressing from left to right in fig 1, each successive biomarker resides closer to the ultimate disease endpoint and theoretically becomes a more relevant measure of exposure for an epidemiological study than does the series of air levels {X_{ij}}. But, on the other hand, biological specimens can be more difficult to obtain and analyse than air measurements; and the particular time series of biomarker levels can be highly autocorrelated when the sampling time interval is shorter than the residence time of a biomarker, thereby adding complexity to the collection and interpretation of data. Also, in moving to the RY and D compartments, biomarkers become increasingly nonspecific and subject to confounding by other agents. For example, N^{2}ethenoguanine, a DNA adduct, can be produced either by exposure to ethylene or ethylene oxide, or by endogenous processes;^{14} likewise, chromosome aberrations can arise from a plethora of chemical agents as well as from ionising radiation and reactive oxygen species.^{15,}^{16}
Although the above classification offers certain insights into the potential roles of biomarkers in the exposure-disease continuum, it does not differentiate with regard to the magnitudes of the key kinetic parameters that affect the variation in biomarker levels over time, that is, the rate constants k_{1i}, k_{3i}, k_{5i}, k_{7i}, and k_{8i}, each with units of time^{−1}. For example, the day-to-day fluctuations in levels of the parent compound {P_{ij}} would be much greater for a volatile organic compound (large k_{1i}) than for a heavy metal (small k_{1i}). To consider the role that the persistence of the biomarker might play in its utility as a surrogate for exposure, it is useful to determine the in vivo residence time of each biomarker in the relevant compartment (that is, 1/k_{1i}, 1/k_{3i}, 1/k_{5i}, 1/k_{7i}, or 1/k_{8i} in fig 1). Here we arbitrarily assign biomarkers into three categories, namely, short term biomarkers with residence time ⩽2 days, intermediate term biomarkers with 2 days < residence time ⩽2 months, and long term biomarkers with residence time >2 months. Under this classification scheme, short term biomarkers persist over time scales of one day to one week, intermediate term biomarkers over weeks to months, and long term biomarkers over months to years.
Aside from theoretical and practical considerations regarding the choice of air samples or biomarkers as surrogate measures of exposure, both air and biomarker concentrations vary within and between persons, thus giving rise to measurement error effects that can bias the estimation of exposure-response relationships. Indeed, the magnitudes of attenuation bias can differ among a given set of candidate biomarkers derived from the same chemical due to differences in residence time, specificity, etc. Thus, it is an open question whether a particular biomarker would be a more or less biasing surrogate for exposure than the corresponding air measurements.
The purpose of the present study is to consider the biasing potential of air and biomarker measurements in terms of attenuation in the estimated slope of a hypothetical log exposure-log response relationship. As will be shown, the biasing potential of each measure (that is, {X_{ij}}, {P_{ij}}, {R_{ij}}, {M_{ij}}, {(RY)_{ij}}, or {D_{ij}} in fig 1) relates to its within- and between-person variance components. Thus, we compile data from occupational and environmental studies that obtained repeated measurements of both air and biological levels from representative persons. Then, we estimate the within- and between-person variance components of air and biological measurements for each study population, after controlling (when necessary) for particular fixed effects of time. Next, we compare these estimated variance components for air samples and for biomarkers classified by residence time (short term, intermediate term, and long term). Finally, we consider the biasing potential of each surrogate measure for estimating a hypothetical exposure-response relationship and comment on strategies for assessing exposures in epidemiological studies.
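The three residence-time categories can be expressed as a simple rule. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that one month equals 30 days, a convention the text does not specify.

```python
def residence_time_from_rate(k_per_day: float) -> float:
    """Residence time (days) as the reciprocal of a first-order rate constant."""
    if k_per_day <= 0:
        raise ValueError("rate constant must be positive")
    return 1.0 / k_per_day


def classify_residence_time(days: float) -> str:
    """Assign a biomarker to a residence-time category.

    Cut points follow the text: <=2 days, >2 days to <=2 months, >2 months.
    Assumption: 2 months is taken as 60 days.
    """
    if days <= 0:
        raise ValueError("residence time must be positive")
    if days <= 2:
        return "short term"
    if days <= 60:
        return "intermediate term"
    return "long term"
```

For instance, a hypothetical elimination rate constant of 2 per day gives a residence time of half a day, which falls in the short term category.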
METHODS
Compilation of the database
The database was compiled from published and unpublished longitudinal studies involving air measurements and/or biomarkers; these studies are summarised in Appendices A and B, for environmental and occupational populations, respectively (see OEM website: http://www.occenvmed.com/supplemental). Because studies were dissimilar in terms of numbers of subjects and numbers of measurements per subject, only studies having at least five subjects with at least two repeated measurements per subject were included in the database, and subjects with single measurements were excluded. Particular attention was paid to data from studies containing repeated measurements of both air levels and biomarkers in a given population. However, due to the paucity of longitudinal studies involving biomarkers, we included some additional sets of biomarker data even when there were no corresponding air measurements. Because inhalation was the primary route of exposure in most studies, personal-air or breathing-zone samples were used. Since we were interested in exposures obtained during normal circumstances, data collected in response to an accidental release or other unusual exposure event were excluded. All data were expressed in the same concentration units as in the original studies. For reference, we point to the compilations of air measurements by Kromhout and colleagues^{17} and biomarker measurements by Symanski and Greeson.^{18} All human data obtained from unpublished sources had been obtained with subjects’ informed consent under protocols approved by the University of California, Berkeley and the University of North Carolina.
Estimation of variance components
Between- and within-person variance components were estimated using mixed-effects linear models, after natural logarithmic transformation of the air or biomarker measurements to achieve approximate normality and homogeneity of variances, and after adjustment for particular time effects. The following model was used:
Y_{ij} = ln(X_{ij}) = γ_{0} + (fixed time effect) + β_{i} + ε_{ij}    (1)
for the j^{th} of n_{i} observations on the i^{th} subject (i = 1, 2, …, k; j = 1, 2, …, n_{i}; n_{i}⩾2), where X_{ij} is the (air or biomarker) concentration, Y_{ij} is the natural logarithm of X_{ij}, γ_{0} is the intercept, the fixed time effect term represents season, weekday, or linear trend, β_{i} is the random effect for the i^{th} person, and ε_{ij} is the random-error effect for the j^{th} observation on the i^{th} person. Here, β_{i} ∼ N(0, σ_{B}^{2}), ε_{ij} ∼ N(0, σ_{W}^{2}), {β_{i}} independent of {ε_{ij}}, and Cov(ε_{ij}, ε_{ij′}) = ρ_{jj′}σ_{W}^{2} for all j≠j′. The variances σ_{B}^{2} and σ_{W}^{2} represent, respectively, the between- and within-person variance components.
The estimates of the between- and within-person variance components, σ_{B}^{2} and σ_{W}^{2} (designated as σ̂_{B}^{2} and σ̂_{W}^{2}), are compiled in Appendices C and D, for environmental and occupational settings, respectively (see OEM website: http://www.occenvmed.com/supplemental). Following Rappaport,^{12} fold ranges of variation between and within persons were also estimated, for illustration purposes, as the ratio of the 97.5^{th} centile to the 2.5^{th} centile of the appropriate lognormal distribution (X_{ij} for air measurements and P_{ij}, R_{ij}, (RY)_{ij}, or D_{ij} for biomarker measurements); that is, _{B}R̂_{0.95} = exp(3.92σ̂_{B}) and _{W}R̂_{0.95} = exp(3.92σ̂_{W}) denote the estimated between-person fold range and the estimated within-person fold range, respectively.
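As a rough illustration of how variance components and fold ranges relate, the Python sketch below uses the classical one-way random-effects ANOVA estimators for balanced data. This is a simpler stand-in, not the REML mixed-model fit used in the study, and it ignores fixed time effects and autocorrelation. The function names are hypothetical. The fold range follows from the lognormal centiles: the ratio of the 97.5^{th} to the 2.5^{th} centile equals exp(2 × 1.96 × σ).

```python
import math
from statistics import mean


def anova_variance_components(groups):
    """Estimate (between, within) variance components on the log scale.

    `groups` holds one list of positive concentrations per person; all
    lists must have the same length n >= 2 (balanced design assumed).
    """
    logs = [[math.log(x) for x in person] for person in groups]
    k = len(logs)
    n = len(logs[0])
    if k < 2 or n < 2 or any(len(p) != n for p in logs):
        raise ValueError("need >= 2 persons with equal n >= 2 measurements")
    person_means = [mean(p) for p in logs]
    grand_mean = mean(person_means)
    # Mean squares within and between persons.
    msw = sum((y - m) ** 2 for p, m in zip(logs, person_means) for y in p)
    msw /= k * (n - 1)
    msb = n * sum((m - grand_mean) ** 2 for m in person_means) / (k - 1)
    var_within = msw
    var_between = max((msb - msw) / n, 0.0)  # truncate negative estimates
    return var_between, var_within


def fold_range(variance):
    """Ratio of the 97.5th to the 2.5th centile of a lognormal distribution."""
    return math.exp(2 * 1.96 * math.sqrt(variance))
```

A variance component of 1.0 on the log scale corresponds to a fold range of exp(3.92), or roughly 50.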
Covariance structures
Compound symmetry (CS) was adopted as the default covariance-matrix structure to estimate the between- and within-person variance components (σ_{B}^{2} and σ_{W}^{2}) under Model (1) using restricted maximum likelihood (REML). Under CS, it is assumed that the subjects are independent of one another and that the correlation between the j^{th} and j′^{th} observations on the i^{th} subject equals σ_{B}^{2}/(σ_{B}^{2} + σ_{W}^{2}) (the intraclass correlation). However, in some situations, it was anticipated that the correlation between measurements from the same person would decrease as the number of time intervals between observations increased. Such an autocorrelation structure would be appropriate when Δt is shorter than the residence time of the biomarker, as might be common for intermediate and long term biomarkers. To identify datasets containing significant autocorrelation, an exponential (EXP) covariance structure was also considered, based on the biomarker residence time, the intervals between measurements, and the average number of repeated observations per person (n_{i}⩾3). For an EXP covariance structure, ρ_{jj′} = exp(−φ|t_{j} − t_{j′}|) for all j≠j′, where t_{j} and t_{j′} are the times (the same for all subjects) at which the j^{th} and j′^{th} measurements were taken. When measurements are taken at the same equally spaced times for all subjects, EXP simplifies to the first-order autoregressive AR(1) covariance structure, where ρ_{jj′} = ρ^{|j−j′|} with ρ = exp(−φΔt). Akaike’s information criterion (AIC) and Schwarz’s Bayesian information criterion (BIC) were used to compare CS and EXP under Model (1) to choose an appropriate covariance structure; CS was chosen unless both AIC and BIC were smaller for EXP than for CS. Appendices C and D list all situations in which EXP was used, rather than CS, to estimate variance components (see OEM website: http://www.occenvmed.com/supplemental).
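The correlation patterns implied by these structures can be written out directly. The Python sketch below is illustrative, with hypothetical function names. It builds the within-person correlation matrix under CS (constant off-diagonal entries equal to the intraclass correlation), under EXP, and under AR(1), so that the equivalence of EXP and AR(1) for equally spaced sampling times can be checked numerically.

```python
import math


def cs_correlation(var_between, var_within, n):
    """Compound symmetry: every pair of observations shares one correlation."""
    icc = var_between / (var_between + var_within)  # intraclass correlation
    return [[1.0 if j == jp else icc for jp in range(n)] for j in range(n)]


def exp_correlation(times, phi):
    """Exponential structure: rho_jj' = exp(-phi * |t_j - t_j'|)."""
    return [[math.exp(-phi * abs(tj - tjp)) for tjp in times] for tj in times]


def ar1_correlation(rho, n):
    """First-order autoregressive structure: rho_jj' = rho ** |j - j'|."""
    return [[rho ** abs(j - jp) for jp in range(n)] for j in range(n)]
```

With weekly sampling times (0, 7, 14, 21 days) and a hypothetical decay parameter φ = 0.1 per day, `exp_correlation` returns the same matrix as `ar1_correlation` with ρ = exp(−0.7).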
Fixed time effects
Given a database consisting of studies ranging in duration from days to years, it was not uncommon to observe situations where average exposure levels changed systematically over time. We considered three types of time effects via the fixed time effect term in Model (1), namely, seasonal effects (studies of at least 6 months), weekday effects (studies of less than 6 months), and linear trends (all studies). Time effects were identified graphically using scatter plots of the raw data, and were then confirmed statistically via likelihood ratio tests comparing Model (1) with and without the fixed time effect term. If no significant time effects were found, variance components were estimated after removing the fixed time effect term from Model (1). To avoid overfitting the model, only a single time effect was used.
If not explicitly specified in Model (1), a missing fixed time effect would tend to exert its biasing influence by increasing the estimate of the within-person variance component (σ_{W}^{2}) and reducing the estimate of the between-person variance component (σ_{B}^{2}).^{19} To gauge the magnitude of such biases on estimation of variance components, whenever a significant time effect was observed, Model (1) was applied to the dataset with and without the fixed time effect term, and the resulting estimates of σ_{W}^{2} and σ_{B}^{2} were compared.
Bias in estimating exposure-disease relationships
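The direction of this bias can be illustrated with a small simulation. The Python sketch below uses ANOVA-type estimators on balanced simulated data rather than the REML fit used in the study, and all parameter values (number of persons, occasions, standard deviations, size of the time effect) are hypothetical. It compares the variance components obtained with and without removing a time effect shared by all persons.

```python
import random
from statistics import mean


def variance_components(y, adjust_time=False):
    """ANOVA-type (between, within) estimates for a balanced k x n table.

    y[i][j] is the logged level for person i at occasion j. With
    adjust_time=True the occasion means are removed first, which treats
    occasion as a fixed effect shared by all persons.
    """
    k, n = len(y), len(y[0])
    df_within = k * (n - 1)
    if adjust_time:
        occasion_means = [mean(row[j] for row in y) for j in range(n)]
        y = [[row[j] - occasion_means[j] for j in range(n)] for row in y]
        df_within = (k - 1) * (n - 1)  # n - 1 df are spent on the time effect
    person_means = [mean(row) for row in y]
    grand = mean(person_means)
    ss_within = sum((v - m) ** 2 for row, m in zip(y, person_means) for v in row)
    msw = ss_within / df_within
    msb = n * sum((m - grand) ** 2 for m in person_means) / (k - 1)
    return (msb - msw) / n, msw


def simulate(k=200, n=6, sd_between=0.7, sd_within=1.0, seed=1):
    """Simulate logged levels with an alternating occasion effect."""
    rng = random.Random(seed)
    time_effect = [0.8 if j % 2 else -0.8 for j in range(n)]
    rows = []
    for _ in range(k):
        b = rng.gauss(0.0, sd_between)  # person-specific random effect
        rows.append([b + time_effect[j] + rng.gauss(0.0, sd_within)
                     for j in range(n)])
    return rows
```

Calling `variance_components(simulate())` and `variance_components(simulate(), adjust_time=True)` shows the pattern described above: omitting the time effect inflates the within-person estimate and, because the between-person mean square is unchanged, lowers the between-person estimate.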
We designate the ratio of the within-person to the between-person variance component as the variance ratio λ = σ_{W}^{2}/σ_{B}^{2}. This variance ratio (λ) can be used to evaluate attenuation bias when estimating an exposure-disease relationship, given that either air or biomarker levels are used as surrogates for actual exposure levels.^{6,}^{20,}^{21} Consider the simple situation where the underlying relationship between the logarithm of the true mean exposure for the i^{th} person (based on air or biomarker levels) and the logarithm of the expected value of a continuous health outcome is a straight line with slope θ_{true} (see Appendix E, true regression model; OEM website: http://www.occenvmed.com/supplemental). Suppose a sample of persons is randomly selected from the population, each subject having n randomly collected measures of exposure. If the average of these n logged exposure measurements is used as a surrogate for the true logged mean exposure level for the i^{th} person (see Appendix E, measurement error model), then the slope parameter actually being estimated (namely, θ*) is related to θ_{true} via the following equation:
θ* = θ_{true}/[1 + (λ/n)(1 + Δ)]    (2)
where Δ = (2/n)Σ_{j<j′}ρ_{jj′}. From Equation (2), we see that θ* is less than θ_{true} (that is, there is attenuation), with the magnitude of the attenuation given by the expression on the right side. We have previously considered a simpler case^{6,}^{20} where ρ_{jj′} = 0 for all j and j′ (under CS), giving Δ = 0 and the well-known expression:^{22}
θ* = θ_{true}/(1 + λ/n)    (3)
In Appendix E, we consider the two special cases, ρ_{jj′} = 0 and ρ_{jj′} = ρ^{|j−j′|}, which correspond to the CS and AR(1) covariance structures, respectively. From Equations (2) and (3), we see that (for a fixed n) attenuation increases as the variance ratio λ increases, which suggests (at least for the simple straight-line model on the log scale being considered in Appendix E) that the exposure surrogate with the smallest λ should produce (on average) the least underestimation of θ_{true}. With this motivation, we use the estimated variance ratio λ̂ (the ratio of the estimated within-person to between-person variance components) to compare air measurements with biomarkers for a given study (smaller is better), consistent with our earlier work.^{6} Here, we denote the estimated λs for air and biological monitoring as λ̂_{air} and λ̂_{bio}, respectively. We also define λ̂_{bio}/λ̂_{air} as the “lambda ratio”; when λ̂_{bio}/λ̂_{air} is less than one, there is evidence that the biomarker would be a better surrogate for exposure than air measurements and vice versa.
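The dependence of attenuation on λ and n can be sketched numerically. The Python code below assumes that the attenuation factor takes the form θ*/θ_{true} = 1/[1 + (λ/n)(1 + Δ)], with Δ = (2/n)Σ_{j<j′}ρ_{jj′}. This form follows from the variance of the mean of n correlated logged measurements under the measurement error model described above; it is a reconstruction and is not quoted from Appendix E. The function names are hypothetical.

```python
def delta_from_correlations(corr):
    """Delta = (2/n) * sum of the error correlations rho_jj' over j < j'."""
    n = len(corr)
    upper = sum(corr[j][jp] for j in range(n) for jp in range(j + 1, n))
    return 2.0 * upper / n


def attenuation_factor(lam, n, delta=0.0):
    """Ratio theta*/theta_true when the mean of n logged measurements is used.

    lam   : variance ratio (within-person / between-person)
    n     : number of measurements per person
    delta : correlation adjustment; zero for uncorrelated errors
    """
    return 1.0 / (1.0 + (lam / n) * (1.0 + delta))


def lambda_ratio(lam_biomarker, lam_air):
    """Biomarker-to-air ratio of variance ratios; < 1 favours the biomarker."""
    return lam_biomarker / lam_air
```

With λ = 1 and a single measurement per person (uncorrelated errors), the estimated slope is expected to be half the true slope. Raising n or lowering λ moves the factor towards one.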
Statistical methods
In addition to statistical analyses involving Model (1) described above, analysis of variance (ANOVA) or nonparametric Wilcoxon rank sum tests (if the distributions were skewed) were used to compare variance components between air measurements and biomarkers. We used PROC MIXED for longitudinal analyses with the SAS statistical package version 8.02 (SAS Institute Inc., Cary, NC). The level of significance of all tests was 0.05.
RESULTS
Description of the database
A total of 12 181 repeated observations from 132 data sets were compiled from 22 studies covering a wide range of pollutants (notably metals, organic compounds, and pesticides) in both environmental and occupational settings (Appendices A and B). The data are summarised in table 1, which lists the numbers of air measurements, biomarker measurements, and subjects, as well as the category of each biomarker according to its type (kinetic compartment in fig 1) and residence time. The numbers of biomarkers in our database decreased from P (21), to M (12), to RY (7), to D (3), to R (2). The database was also reasonably populated with biomarkers in all three categories of residence time, that is, short term (21), intermediate term (15), and long term biomarkers (9). For some contaminants, more than one biomarker was measured. After excluding 104 pre-shift observations, the data used for analysis (12 077 observations) included 50 air-exposure data sets (4623 observations) and 77 biomarker data sets (7454 observations).
Effects of time on estimation of variance components
Significant effects of time were found in approximately one third (18 of 50) of air monitoring data sets and in approximately half (36 of 77) of biomarker data sets (Appendices C and D). One such effect is illustrated in fig 2A, which shows a seasonal effect in levels of free styrene glycol in blood (μg/ml) observed among reinforced plastics workers during three surveys conducted 3–4 months apart (unpublished data from a study described by Rappaport and colleagues^{24}). When Model (1) was fitted to the data without a fixed seasonal effect, the residuals deviated from the horizontal line representing zero (fig 2B). In contrast, when Model (1) was fitted to the data with a fixed seasonal effect, the residuals varied randomly about zero (fig 2C). The estimated within- and between-person variance components were potentially biased due to fitting Model (1) to the data without a seasonal effect; that is, the within-person estimate σ̂_{W}^{2} increased from 1.24 to 1.33 (+6.9%) and the between-person estimate σ̂_{B}^{2} decreased from 0.595 to 0.562 (−5.5%).
Table 2 summarises the contributions of time effects to the estimated within- and between-person variance components (σ̂_{W}^{2} and σ̂_{B}^{2}) in all datasets. If an important time effect was wrongly excluded from Model (1), then σ̂_{W}^{2} typically increased by 18.2% (median value) for air measurements and 25.4% (median value) for biomarker measurements. Conversely, if an important time effect was excluded from Model (1), σ̂_{B}^{2} typically decreased by 11.3% (median value) for air measurements and 4.1% (median value) for biomarkers.
Alternative covariance structure
Of all the studies in our database, only two produced significantly better fits to Model (1) with an EXP (rather than a CS) covariance structure, namely, DDE and trans-nonachlor in blood,^{30} both long term biomarkers, and inorganic lead and δ-aminolevulinate in urine,^{26} both intermediate term biomarkers (see Appendices C and D). This suggests that CS is generally appropriate for applications of Model (1) to air and biomarker measurements.
Between- and within-person variance components
The cumulative distributions of the estimated between- and within-person variance components are shown in fig 3 in terms of the corresponding fold ranges (that is, the estimated between-person and within-person fold ranges, respectively) for air measurements and biomarkers. The difference between distributions of the between-person fold range for air measurements (median = 7.4) and for biomarkers (median = 7.7) was not significant (Wilcoxon rank sum test, p = 0.54) (see fig 3A). Within-person variation was much greater than between-person variation for both air measurements (median within-person fold range = 48.9) and biomarkers (median = 17.4) (fig 3B). Also, the within-person fold ranges for biomarkers were significantly smaller than those for air measurements (Wilcoxon rank sum test, p<0.01). We attribute this to the smoothing of exposure variability in the human body, which increases with the residence time of the biomarker.^{27,}^{28} Indeed, median values of the within-person fold range for biomarkers decreased in the order: short term (median = 44.6) > intermediate term (median = 3.7) > long term (median = 3.3).
Environmental exposures varied much more within persons than occupational exposures for both air measurements (environmental: median within-person fold range = 104; occupational: median = 13.7) and biomarkers (environmental: median = 36.6; occupational: median = 7.6). For comparison, table 3 also shows the cumulative distributions of the between- and within-person fold ranges estimated from occupational studies involving air measurements, reported by Kromhout and colleagues,^{17} and biomarkers, reported by Symanski and Greeson.^{18} Neither the between-person nor the within-person fold range distribution was found to differ significantly between our database and those earlier compilations (data not shown). Overall, the databases show that the median within-person fold range was greater than the median between-person fold range in a given setting for both air measurements and biomarkers.
Bias in estimating exposure-disease relationships
Potential bias in the estimation of the slope (θ_{true}) of an assumed straight-line log exposure-log disease relationship (see Appendix E) was evaluated by examining the estimated variance ratio, λ̂ (smaller is better). In general, values of λ̂ for biomarkers (median = 1.04) were significantly smaller than those for air measurements (median = 2.40) (Wilcoxon rank sum test, p = 0.02). From this result, we infer that using a biomarker as a surrogate exposure measure in a typical study would tend to provide a less biased estimate of the slope of a log exposure-log disease linear relationship than would the use of air measurements.
Figure 4 shows the median and interquartile ranges of the estimated variance ratio λ̂ for both air measurements and biomarkers, stratified by exposure setting (fig 4A) and type of agent (fig 4B). Biomarkers produced significantly smaller values of λ̂ than air measurements (Wilcoxon rank sum test) for environmental exposures (p = 0.01) (fig 4A) and for metal exposures (p = 0.03) (fig 4B). However, for pesticide exposures, air measurements produced significantly smaller values than did biomarker measurements (p = 0.04) (fig 4B). Estimates of λ̂ are shown in fig 4C for measurements stratified by the residence time of the biomarker. Here, a decreasing trend for λ̂ was observed for biomarkers in the order: short term > intermediate term > long term, consistent with the reductions in within-person fold range noted previously.
The above comparisons were based on all studies in our database, whether or not parallel measurements of air levels and biomarkers were included in each investigation. To make direct comparisons between air and biomarker measurements in a given study, the estimated lambda ratio (λ̂ for the biomarker divided by λ̂ for the air measurements) was investigated. Of the 54 data sets that provided parallel measurements, almost two thirds (62%) had estimated lambda ratios of less than one (median lambda ratio = 0.46), again providing evidence that biomarkers tend to provide less biasing measures of exposure than air measurements. The median and interquartile ranges of estimated lambda ratios were also stratified and compared by exposure setting, type of agent, and biomarker residence time as shown in fig 5. Results here are generally consistent with those from fig 4, with estimated lambda ratios less than one for environmental settings (fig 5A) and for metals, but not for pesticides (fig 5B). However, the estimated lambda ratios increased in the order: intermediate term biomarkers < short term biomarkers < long term biomarkers (fig 5C), which was unanticipated based on the earlier comparisons of λ̂ (see fig 4C). This could reflect the relatively small numbers of studies with parallel air and biomarker measurements and the fact that the few long term biomarkers represented (n = 6) included several nonspecific endpoints, such as HPRT mutations and SCEs, that could have been influenced by smoking, ionising radiation, and other types of exposures.^{38}
DISCUSSION
Our findings support the notion that biomarkers can offer a desirable alternative to air sampling for assessing exposures to chemicals. In addition to providing the oft-mentioned theoretical advantages (accounting for all exposure routes and interindividual differences and residing closer to the disease process), biomarkers also tend to have smaller variance ratios (λ), and, therefore, to be potentially less biasing surrogate measures of exposure than air measurements for studies of health effects. This particular advantage of biomarkers has only been mentioned anecdotally heretofore.^{6,}^{20}
If values of λ are to be considered in designing a health effects study, it is important that the within- and between-person variance components (σ_{W}^{2} and σ_{B}^{2}) be estimated with minimal bias. For both air and biomarker measurements, we found that time effects and the choice of covariance structure could be important to the characterisation of these variance components. Excluding an important fixed time effect had a greater impact on σ̂_{W}^{2} than on σ̂_{B}^{2}, as observed for other types of longitudinal data.^{39} This would tend to increase values of λ̂ for the candidate exposure measures, making them appear worse than they actually are. Regarding the choice of covariance structure, we found that CS was appropriate for characterising σ_{B}^{2} and σ_{W}^{2} in virtually all cases. However, CS assumes that repeated measurements collected from a given person have the same correlation no matter how far apart they are in time. Thus, investigators should be aware of potential problems arising from the timing of biomarker measurements relative to the residence time, particularly for intermediate and long term biomarkers, or should use an EXP covariance structure.
We found that estimated within-person variability tended to be larger for environmental exposures than for occupational exposures for both air and biomarker measurements (fig 2B). This indicates that members of the general public experience greater ranges of pollutant levels in their everyday lives than do workers in a given factory and job (as noted in Rappaport and Kupper^{21}), and may explain why biomarkers had consistently smaller lambda ratios in environmental studies than in occupational studies (fig 5A). Thus, biomonitoring may be more advantageous in environmental settings than in occupational settings.
Among biomarkers, we noticed a decreasing trend for the estimated variance ratio λ̂ in the order: short term > intermediate term > long term, likely due to the smoothing of exposure variability related to slow elimination of the biomarker (fig 4C). This suggests that biomarkers with longer residence times would be preferred to those with shorter residence times. That is, smaller numbers of biomarker measurements per subject would be needed to help control for attenuation bias in an exposure-response relationship (see Appendix E). For example, to estimate the slope of the linear relationship between logged inorganic lead exposure and some logged continuous health outcome, with a bias no larger than 0.10, we use the data of Cope and colleagues^{26} to estimate n. From equation (2), the number of measurements per subject required for a given bias can be determined as the smallest positive integer n satisfying the inequality:
n ⩾ λ(1 + Δ)(1 − b)/b    (4)
where b = (θ_{true} − θ*)/θ_{true} denotes the relative attenuation bias and Δ = (2/n)Σ_{j<j′}ρ_{jj′}. The minimum sample size was estimated by substituting b = 0.10 into equation (4), along with the estimates of λ (16.6 for inorganic lead exposure, and 0.654 for urinary lead, Appendix D) and ρ (zero for air lead and 0.25 for urinary lead under AR(1)) for the corresponding true parameters. This leads to estimated sample sizes of n = 10 measurements per subject for lead in urine and n = 150 measurements per subject for lead in air. There should be little doubt in this case that the biomarker would provide a better surrogate measure of exposure than air measurements for investigating health effects in this population, as has previously been argued in the context of hazard control.^{40}
The advantage noted above for intermediate and long term biomarkers (relative to air measurements) will not generally be realised for short term biomarkers, which reflect exposure during the current or preceding day. In the study by Rappaport and colleagues,^{26} for example, styrene in exhaled air (a short term biomarker) was measured along with styrene in air. Using data from that investigation, λ was 0.28 for air styrene and 0.99 for styrene in exhaled air (Appendix D), and Δ was 0 in both cases. To achieve the desired goal of b⩽0.1, from equation (4) we require n = 3 measurements per person for air styrene and n = 9 measurements of styrene in exhaled air. In this case, there is evidence that styrene in air would be a better surrogate measure of exposure than styrene in exhaled air.
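The sample sizes quoted above for lead and styrene can be checked with a short calculation. The sketch below does not claim to be equation (4). It assumes the usual attenuation factor for classical measurement error, 1/(1 + λ·g(n)), where g(n) is the variance of the mean of n equally spaced AR(1) errors relative to a single error (g(n) = 1/n when ρ = 0). It also treats the styrene measurements as uncorrelated. The function names are hypothetical.

```python
def variance_inflation(n, rho):
    """Variance of the mean of n equally spaced AR(1) errors,
    relative to the variance of one error (equals 1/n when rho == 0)."""
    lag_sum = sum((n - k) * rho ** k for k in range(1, n))
    return (n + 2.0 * lag_sum) / n ** 2


def attenuation_bias(lam, n, rho=0.0):
    """Relative attenuation bias of the slope when the mean of n logged
    measurements per person stands in for true logged exposure
    (assumed form, see the text above)."""
    ratio = lam * variance_inflation(n, rho)
    return ratio / (1.0 + ratio)


def min_measurements(lam, max_bias, rho=0.0, n_max=100_000):
    """Smallest n for which attenuation_bias(lam, n, rho) <= max_bias."""
    for n in range(1, n_max + 1):
        if attenuation_bias(lam, n, rho) <= max_bias:
            return n
    raise ValueError("no n up to n_max meets the bias target")


if __name__ == "__main__":
    cases = [
        ("lead in air", 16.6, 0.0),
        ("lead in urine", 0.654, 0.25),
        ("styrene in air", 0.28, 0.0),
        ("styrene in exhaled air", 0.99, 0.0),
    ]
    for label, lam, rho in cases:
        print(label, min_measurements(lam, 0.10, rho))
```

Under these assumptions the sketch returns 150 and 10 measurements per subject for lead in air and urine, and 3 and 9 for styrene in air and exhaled air, in agreement with the values given in the text.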
Although the above calculations indicate that biomarkers with longer residence times would generally be preferred to those with shorter residence times, the specificity of candidate biomarkers and the precision of assays can also be important. Consider, for example, the eight biomarkers listed in Appendix D for styrene and styrene oxide (SO). Values of λ for these biomarkers increased in the following order: blood styrene (0.770, two studies) < breath styrene (0.989) < urinary mandelic acid (1.44, two studies) < lymphocyte SCEs (1.58) < lymphocyte HPRT mutation frequency (1.77) < blood styrene glycol (2.09) < blood SO-albumin adduct (14.3) < SO-DNA adduct (24.7). The two smallest values of λ were observed for short term biomarkers (styrene in blood and breath), while the two largest values of λ were for intermediate term biomarkers (albumin and DNA adducts of SO). This can partially be explained by the imprecision of the adduct assays; indeed, the coefficient of variation of the postlabelling assay for the DNA adduct was about 200%.^{23}
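To give a rough sense of how assay imprecision can inflate a variance ratio, the sketch below converts an assay coefficient of variation into a log-scale variance and adds it to the within-person component. It assumes multiplicative lognormal assay error that is independent of true biomarker levels. This is an illustrative assumption rather than a result from the study, and the function names and input variances are hypothetical.

```python
import math


def log_scale_variance(cv):
    """Variance of log(X) for a lognormal X with coefficient of variation cv."""
    return math.log(1.0 + cv ** 2)


def observed_ratio(var_within, var_between, assay_cv):
    """Within- to between-person variance ratio after independent assay
    error (in log scale) is added to the within-person component."""
    return (var_within + log_scale_variance(assay_cv)) / var_between


if __name__ == "__main__":
    # A CV of 200% corresponds to a log-scale variance of ln(5), about 1.61.
    print(round(log_scale_variance(2.0), 2))
    # Hypothetical components: a ratio of 1.0 without assay error.
    print(round(observed_ratio(1.0, 1.0, 2.0), 2))
```

Under this assumption, an assay with a coefficient of variation of 200% would by itself contribute about 1.6 units of log-scale variance to the within-person component.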
Our analyses did not permit inferences to be made about the effects on variance ratios of important metabolising and repair genes. However, it is reasonable to expect that functional SNPs of these genes would increase the between-person variance of relevant biomarkers while having little effect on the within-person variance. Since air levels should be independent of SNP status, the practical effect of functional SNPs would be to preferentially decrease variance ratios for biomarkers relative to air levels. This would also reduce the biasing effect of such biomarkers as surrogates for exposure.
Aside from the biasing potential of using air and biomarker measurements as surrogates for true exposure levels, other constraints could loom large, such as the difficulty of repeatedly collecting blood specimens rather than air samples from a population or the increased costs of biological measurements compared to air measurements. Also, our analyses implicitly assume that air represents the dominant route of exposure to the toxic chemical. This will not always be the case. For example, we found that air measurements of pesticides produced significantly smaller values than did biomarker measurements (fig 4B), suggesting that ingestion and/or dermal contact were reflected by biomarkers in those studies. Taking all factors into account, the optimal measure of exposure for an epidemiology study depends not only on variance ratios of the air and biomarker measurements (smaller is better), but also on projected sample sizes (larger is better), based on practical considerations and costs, and on knowledge of the dominant route of exposure (if multiple routes, biomarkers are better).
Finally, studies that collect both air measurements and biomarkers are particularly valuable because they provide information with which to estimate the rates of human uptake, elimination, and metabolism of toxic chemicals. Given the paucity of human toxicokinetic data for most contaminants, the quantification of such rates with primary data from observational studies would be valuable. When collected in longitudinal sampling designs, where repeated biomarker and air measurements are obtained from representative persons, these data can also allow interindividual differences in uptake, elimination, and metabolism to be estimated and ultimately related to genetic, physiological, and lifestyle factors (for example, see Rappaport and colleagues^{41}).
Limitations of the study
Our analyses were limited in several important ways. First, we were constrained by the relatively few studies that provided longitudinal data of both air measurements and biomarkers from a given population and by the limited numbers of measurements per subject in most investigations. Small sample sizes particularly limited our ability to draw clear conclusions in stratified comparisons (for example, for biomarkers of pesticides and long term biomarkers). Second, since most of our database was derived from secondary data, it was only possible to examine the effects of a relatively small number of covariates, such as occupational or environmental sources of exposure. Third, we focused entirely on exposures to airborne contaminants, recognising that other routes (dermal contact or ingestion) could have contributed significantly to biomarker levels in some cases. Fourth, we considered biasing measurement error effects only in the context of individual-based studies, where air measurements or biomarkers were obtained from each person in a sample and the logged continuous health outcome was related to the logged individual mean measure of exposure. The statistical issues in such an individual-based study are somewhat different from those in a group-based study, where the mean health outcome for each group is compared with the corresponding group mean of the exposure measure.^{42} Finally, we recognise that our database was confined largely to published investigations of biological monitoring. These studies could well have been biased in favour of biomarkers that had previously been shown to be useful, such as metals in blood and urine. If this were the case, then we may have somewhat overstated the advantage of biomarkers as less biasing surrogates for exposure.
Main messages

There is considerable within- and between-person variability in both air and biomarker levels, which affects the valid and precise characterisation of exposure-disease relationships.

Although air measurements and biomarkers each have distinct advantages and disadvantages as surrogate measures of exposure, biomarkers appear to provide less biased estimates of true exposure levels for epidemiological studies.
Policy implications

Epidemiologists should consider using biomarkers instead of, or in addition to, air measurements for assessing levels of chemical exposure.
Conclusions
We identified consistently large variability in air levels and biomarkers both between and within persons in a large number of longitudinal studies of chemical exposure. We argue that the air or biological measure with the smallest (within- to between-person) variance ratio should be the optimal (that is, least biasing) surrogate for exposure in a study of health effects. We present evidence that biomarkers tend to have smaller variance ratios than air measurements. Epidemiologists should consider the magnitudes of variance ratios of air measurements and biomarkers as one criterion for selecting the optimal surrogate for exposure in their studies.
Acknowledgments
The authors appreciate the assistance of Dr AiKo Liu and Mr Sungkyoon Kim for technical support. All authors declare that they have no competing interests regarding publication of material contained in this paper.
REFERENCES
Supplementary materials
The appendices are available as a downloadable PDF (printer friendly file).
Footnotes

Funding: this work was supported by contract MTH0311 from the American Chemistry Council and by Center Grant P30ES10126 from the National Institute of Environmental Health Sciences

Competing interests: none