Air samples versus biomarkers for epidemiology
Y S Lin¹, L L Kupper², S M Rappaport¹
  ¹ Department of Environmental Sciences and Engineering, School of Public Health, University of North Carolina, Chapel Hill, North Carolina, USA
  ² Department of Biostatistics, School of Public Health, University of North Carolina, Chapel Hill, North Carolina, USA
Correspondence to: Prof. S M Rappaport, CB# 7431, University of North Carolina, Chapel Hill, NC 27599-7431, USA


Background: It has been speculated on theoretical grounds that biomarkers are superior to air samples as surrogates for chemical exposures in epidemiology studies.

Methods and Results: Biomarkers were classified according to their position in the exposure-disease continuum—that is, parent compound, reactive intermediate, stable metabolite, macromolecular adduct, or measure of cellular damage. Because airborne exposures and these different biomarkers are time series that vary within and between persons in a population, they are all prone to measurement error effects when used as surrogates for true chemical exposures. It was shown that the attenuation bias in the estimated slope characterising a log exposure-log disease relation should decrease as the within- to between-person variance ratio of a given set of air or biomarker measurements decreases. To gauge the magnitudes of these variance ratios, a database of 12 077 repeated observations was constructed from 127 datasets, including air and biological measurements from either occupational or environmental settings. The within- and between-person variance components (in log scale, after controlling for fixed effects of time) and the corresponding variance ratios for each set of air and biomarker measurements were estimated. It was shown that estimated variance ratios of biomarkers decreased in the order short term (residence time ⩽2 days) > intermediate term (2 days < residence time ⩽2 months) > long term biomarkers (residence time >2 months). Overall, biomarkers had smaller variance ratios than air measurements, particularly in environmental settings. This suggests that a typical biomarker would provide a less biasing surrogate for exposure than would a typical air measurement.

Conclusion: Epidemiologists are encouraged to consider the magnitudes of variance ratios, along with other factors related to practicality and cost, in choosing among candidate surrogate measures of exposure.

  • environmental monitoring
  • biomarkers
  • air measurements
  • variance components
  • epidemiology
  • attenuation

A major goal of occupational and environmental epidemiology is to establish quantitative relationships between exposures to toxic chemicals and the associated risks of disease. Most studies have considered airborne exposures, where inhalation was the primary route of entry of the contaminant into the body. Investigators collected air samples to estimate concentrations inhaled by members of occupational groups or by the general population. Steady technological advances over the last 50 years have made it possible to collect large numbers of air measurements and thereby to reduce uncertainties in quantifying levels of exposure. Although the anticipated gains in sample size have not materialised,1,2 the technology currently exists to conduct longitudinal studies of health effects from chemical exposures.

Biological monitoring has been increasingly viewed as a desirable alternative to air sampling for characterising occupational and environmental exposures. (Here we use the term “environmental exposures” to refer to chemical exposures in indoor and outdoor settings not associated with workplaces.) This technique utilises biological specimens, especially breath, urine, and blood, to quantify levels of contaminants or their products in the body.3,4 Biological monitoring is theoretically desirable because it accounts for all possible exposure routes (for example, inhalation, ingestion, and dermal contact), it covers unexpected or accidental exposures, and it reflects interindividual differences in uptake or genetic susceptibility.5–8

The endpoint of biological monitoring is often referred to as a biomarker, defined by the US National Research Council (NRC) as “… a change induced by a contaminant in the biochemical or cellular components of a process, structure or function that can be measured in a biological system”.9 The NRC divided biomarkers into three categories, namely, biomarkers of exposure, of effect, and of susceptibility. Examples of biomarkers of exposure include volatile organic compounds in breath, heavy metals in blood or urine, urinary metabolites of organic compounds, and adducts of genotoxic chemicals with haemoglobin or albumin. Biomarkers of effect represent early preclinical changes thought to be related to health risk or damage. Examples include DNA adducts of genotoxic chemicals, specific gene mutations such as hypoxanthine-guanine phosphoribosyltransferase (HPRT), changes in serum proteins indicative of altered metabolism or function, and cytogenetic changes in peripheral lymphocytes, including chromosome aberrations and sister chromatid exchanges (SCEs). Finally, biomarkers of susceptibility relate to an individual’s inherited or acquired ability to respond to a hazardous substance. Single nucleotide polymorphisms (SNPs) of important phase-I enzymes (generally bioactivating enzymes such as cytochrome P450) and phase-II enzymes (generally deactivating enzymes, such as glutathione-S-transferases or epoxide hydrolases) are often regarded as biomarkers of susceptibility.10

The relationship between exposure to a toxic substance and the many possible biomarkers is shown in fig 1, using a genotoxic carcinogen to illustrate the functional elements.11–13 Processes leading to chronic diseases other than cancer can be described by similar schemes. The input to the model is {Xij}, representing the time series of n discrete exposures to the carcinogen (j = 1, 2, …, n) (each averaged over time unit, Δt), received by the ith person in a population. The subsequent time series {Pij}, {Rij}, {Mij}, {(RY)ij} and {Dij} represent the corresponding time series of biomarkers (to be defined).

Figure 1

 Kinetic processes relating exposure to a carcinogen with various biomarkers in an exposed population. Each process is represented by a time series of levels (in brackets) observed in the ith person after the jth time interval. {Xij} is the series of exposures, {Pij} is the series of the parent compound in the body, {Rij} is the series of a reactive carcinogen, {Mij} is the series of a stable metabolite, {(RY)ij} is the series of a DNA adduct, and {Dij} is the series of damaged cells; k0i – k8i represent rate constants for the various processes.

The chemical must first be absorbed into the body via inhalation at rate k0i (designated as the uptake rate for the ith person). In some instances, the substance is intrinsically electrophilic and capable of reacting with DNA and proteins. However, most cancer causing chemicals must first be metabolically activated to electrophiles. Thus, in fig 1, {Pij} refers to the levels of the parent compound, while {Rij} represents levels of the electrophile. The relative amounts of Pij and Rij, at any time, depend on competing rates of passive elimination of the parent compound (designated k1i, including excretion in breath and urine), of metabolic bioactivation of Pij to Rij (designated k2i), and of detoxification of Rij (designated k3i) giving rise to a stable metabolite Mij, excreted at rate k8i. A fraction of Rij reacts at rate k4i with DNA to produce a DNA adduct (RY)ij, where Y represents a DNA base. In fig 1, molecular damage is represented as the series of adduct levels {(RY)ij}, at different times. Most cells contain repair systems that remove DNA adducts and thereby protect the tissue from long term damage. Thus, the amount of (RY)ij depends on the relative rates of DNA adduction (that is, k4i) and repair (designated k5i). Cellular damage is represented by the series representing damaged cells {Dij}, which depends on the rates of cell damage (given by k6i), and repair and/or cell turnover (at rate k7i). The magnitude of an individual’s risk of cancer ultimately relates to the integration of Dij over time, relative to some period of latency, and to his or her susceptibility as determined by genetic, physiological, metabolic, and lifestyle factors. Note that the rate constants k0i – k8i are assumed to be constant for the ith individual but to vary across the population.

In the context of the NRC’s definitions of biomarkers, {Pij}, {Rij} and {Mij} would be biomarkers of exposure, while {(RY)ij} and {Dij} would be biomarkers of effect. Note that biomarkers of susceptibility would be measures of the variability of some rate constants (particularly k2i – k7i) across the population (analogous to effect modifiers).

Progressing from left to right in fig 1, each successive biomarker resides closer to the ultimate disease endpoint and theoretically becomes a more relevant measure of exposure for an epidemiological study than does the series of air levels {Xij}. But, on the other hand, biological specimens can be more difficult to obtain and analyse than air measurements; and the particular time series of biomarker levels can be highly autocorrelated when the sampling time interval is shorter than the residence time of a biomarker, thereby adding complexity to the collection and interpretation of data. Also, in moving to the RY and D compartments, biomarkers become increasingly non-specific and subject to confounding by other agents. For example, N2-ethenoguanine, a DNA adduct, can be produced either by exposure to ethylene or ethylene oxide, or by endogenous processes;14 likewise, chromosome aberrations can arise from a plethora of chemical agents as well as from ionising radiation and reactive oxygen species.15,16

Although the above classification offers certain insights into the potential roles of biomarkers in the exposure-disease continuum, it does not differentiate with regard to the magnitudes of the key kinetic parameters that affect the variation in biomarker levels over time, that is, the rate constants k1i, k3i, k5i, k7i, and k8i, each with units of time⁻¹. For example, the day-to-day fluctuations in levels of the parent compound {Pij} would be much greater for a volatile organic compound (large k1i) than for a heavy metal (small k1i). To consider the role that the persistence of the biomarker might play on its utility as a surrogate for exposure, it is useful to determine the in vivo residence time of each biomarker in the relevant compartment (that is, 1/k1i, 1/k3i, 1/k5i, 1/k7i, or 1/k8i in fig 1). Here we arbitrarily assign biomarkers into three categories, namely, short term biomarkers with residence time ⩽2 days, intermediate term biomarkers with 2 days < residence time ⩽2 months, and long term biomarkers with residence time >2 months. Under this classification scheme, short term biomarkers persist over time scales of one day to one week, intermediate term biomarkers over weeks to months, and long term biomarkers over months to years.
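The residence-time classification above can be sketched in a few lines of code. This is a minimal illustration, not part of the original analysis: the function names are hypothetical, rate constants are assumed to be expressed per day, and "2 months" is taken as 60 days.

```python
def residence_time_days(k_per_day: float) -> float:
    """Residence time (days) as the reciprocal of a first-order rate constant (1/day)."""
    if k_per_day <= 0:
        raise ValueError("rate constant must be positive")
    return 1.0 / k_per_day


def classify_biomarker(residence_days: float) -> str:
    """Assign the category used in the text from a residence time in days."""
    if residence_days <= 2:
        return "short term"
    if residence_days <= 60:  # assumption: 2 months taken as 60 days
        return "intermediate term"
    return "long term"


# Hypothetical rate constants (per day), for illustration only
for k in (1.0, 0.05, 0.005):
    tau = residence_time_days(k)
    print(f"k = {k}/day -> residence time {tau:.0f} days -> {classify_biomarker(tau)}")
```

A rate constant of 1 per day gives a residence time of 1 day (short term), whereas 0.005 per day gives 200 days (long term).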

Aside from theoretical and practical considerations regarding the choice of air samples or biomarkers as surrogate measures of exposure, both air and biomarker concentrations vary within and between persons, thus giving rise to measurement error effects that can bias the estimation of exposure-response relationships. Indeed, the magnitudes of attenuation bias can differ among a given set of candidate biomarkers derived from the same chemical due to differences in residence time, specificity, etc. Thus, it is an open question whether a particular biomarker would be a more or less biasing surrogate for exposure than the corresponding air measurements.

The purpose of the present study is to consider the biasing potential of air and biomarker measurements in terms of attenuation in the estimated slope of a hypothetical log exposure-log response relationship. As will be shown, the biasing potential of each measure (that is, {Xij}, {Pij}, {Rij}, {Mij}, {(RY)ij}, or {Dij} in fig 1) relates to its within- and between-person variance components. Thus, we compile data from occupational and environmental studies that obtained repeated measurements of both air and biological levels from representative persons. Then, we estimate the within- and between-person variance components of air and biological measurements for each study population, after controlling (when necessary) for particular fixed effects of time. Next, we compare these estimated variance components for air samples and for biomarkers classified by residence time (short term, intermediate term, and long term). Finally, we consider the biasing potential of each surrogate measure for estimating a hypothetical exposure-response relationship and comment on strategies for assessing exposures in epidemiological studies.


Compilation of the database

The database was compiled from published and unpublished longitudinal studies involving air measurements and/or biomarkers; these studies are summarised in Appendices A and B, for environmental and occupational populations, respectively (see the OEM website). Because studies were dissimilar in terms of numbers of subjects and numbers of measurements per subject, only studies having at least five subjects with at least two repeated measurements per subject were included in the database, and subjects with single measurements were excluded. Particular attention was paid to data from studies containing repeated measurements of both air levels and biomarkers in a given population. However, due to the paucity of longitudinal studies involving biomarkers, we included some additional sets of biomarker data even when there were no corresponding air measurements. Because inhalation was the primary route of exposure in most studies, personal-air or breathing-zone samples were used. Since we were interested in exposures obtained during normal circumstances, data collected in response to an accidental release or other unusual exposure event were excluded. All data were expressed in the same concentration units as in the original studies. For reference, we point to the compilations of air measurements by Kromhout and colleagues17 and biomarker measurements by Symanski and Greeson.18 All human data obtained from non-published sources had been obtained with subjects’ informed consent under protocols approved by the University of California, Berkeley and the University of North Carolina.

Estimation of variance components

Between- and within-person variance components were estimated using mixed-effects linear models, after natural logarithmic transformation of the air or biomarker measurements to achieve approximate normality and homogeneity of variances, and after adjustment for particular time effects. The following model was used:

$$Y_{ij} = \ln(X_{ij}) = \gamma_0 + \sum_{u=1}^{U} \gamma_u C_{uij} + \beta_i + \varepsilon_{ij} \qquad (1)$$

for the jth of $n_i$ observations on the ith subject ($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n_i$; $n_i \geq 2$), where $X_{ij}$ is the (air or biomarker) concentration, $Y_{ij}$ is the natural logarithm of $X_{ij}$, $\gamma_0$ is the intercept, $\sum_{u} \gamma_u C_{uij}$ represents the fixed time effect (that is, for season, weekday, or linear trend), $\beta_i$ is the random effect for the ith person, and $\varepsilon_{ij}$ is the random-error effect for the jth observation on the ith person. Here, $\beta_i \sim N(0, \sigma_B^2)$, $\varepsilon_{ij} \sim N(0, \sigma_W^2)$, $\{\beta_i\}$ independent of $\{\varepsilon_{ij}\}$, and $\mathrm{Cov}(\varepsilon_{ij}, \varepsilon_{ij'}) = \sigma_W^2 \rho_{jj'}$ for all $j \neq j'$. The variances $\sigma_B^2$ and $\sigma_W^2$ represent, respectively, the between- and within-person variance components.
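To make the structure of Model (1) concrete, the following sketch simulates logged measurements with a random person effect and a random error (no fixed time effect), and then recovers the two variance components. It is an illustration under stated assumptions: the data are balanced, and the estimates come from the classical one-way random-effects ANOVA (method of moments) rather than the REML fit used in the study; all function names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_log_exposures(k, n, mu, sigma_b, sigma_w, seed=0):
    """Simulate Y_ij = mu + beta_i + eps_ij for k persons with n measurements each."""
    rng = random.Random(seed)
    data = []
    for _ in range(k):
        beta_i = rng.gauss(0.0, sigma_b)  # random person effect
        data.append([mu + beta_i + rng.gauss(0.0, sigma_w) for _ in range(n)])
    return data


def anova_variance_components(data):
    """Return (between-person, within-person) variance estimates for balanced data."""
    k, n = len(data), len(data[0])
    if any(len(row) != n for row in data):
        raise ValueError("this sketch assumes the same number of measurements per person")
    person_means = [statistics.fmean(row) for row in data]
    grand_mean = statistics.fmean(person_means)
    msb = n * sum((m - grand_mean) ** 2 for m in person_means) / (k - 1)
    msw = sum((y - m) ** 2 for row, m in zip(data, person_means) for y in row) / (k * (n - 1))
    # Negative between-person estimates are truncated at zero
    return max((msb - msw) / n, 0.0), msw


data = simulate_log_exposures(k=400, n=5, mu=0.0, sigma_b=0.5, sigma_w=1.0, seed=1)
var_b, var_w = anova_variance_components(data)
print(f"between-person variance ~ {var_b:.2f}, within-person variance ~ {var_w:.2f}")
```

With many persons the estimates should land near the simulated values (0.25 and 1.0 here); with the small numbers of subjects typical of real studies they can be far less precise.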

The estimates of $\sigma_B^2$ and $\sigma_W^2$ (designated as $\hat{\sigma}_B^2$ and $\hat{\sigma}_W^2$) are compiled in Appendices C and D, for environmental and occupational settings, respectively (see the OEM website). Following Rappaport,12 fold-ranges of variation between- and within-persons were also estimated, for illustration purposes, as the ratio of the 97.5th centile to the 2.5th centile of the appropriate lognormal distribution (Xij for air measurements and Pij, Rij, (RY)ij, or Dij for biomarker measurements); that is, ${}_B\hat{R}_{0.95} = \exp(3.92\,\hat{\sigma}_B)$ and ${}_W\hat{R}_{0.95} = \exp(3.92\,\hat{\sigma}_W)$ denote the estimated between-person fold-range and the estimated within-person fold-range, respectively.
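The fold-range described above follows directly from the lognormal assumption: the 97.5th and 2.5th centiles lie 1.96 standard deviations either side of the mean on the log scale, so their ratio is exp(2 × 1.96 × σ) = exp(3.92σ). A minimal sketch of the conversion in both directions (function names are hypothetical):

```python
import math


def fold_range(variance: float) -> float:
    """Ratio of the 97.5th to the 2.5th centile of a lognormal distribution
    whose logged values have the given variance."""
    if variance < 0:
        raise ValueError("variance must be non-negative")
    return math.exp(3.92 * math.sqrt(variance))


def variance_from_fold_range(r: float) -> float:
    """Inverse of fold_range: log-scale variance implied by a 95% fold-range."""
    if r < 1:
        raise ValueError("a fold-range cannot be below 1")
    return (math.log(r) / 3.92) ** 2


print(round(fold_range(1.0), 1))   # a log-scale variance of 1 spans roughly a 50-fold range
print(round(fold_range(0.25), 1))  # a log-scale variance of 0.25 spans roughly a 7-fold range
```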

Covariance structures

Compound symmetry (CS) was adopted as the default covariance-matrix structure to estimate $\sigma_B^2$ and $\sigma_W^2$ under Model (1) using restricted maximum likelihood (REML). Under CS, it is assumed that the subjects are independent of one another and that the correlation between the jth and j′th observations on the ith subject equals $\sigma_B^2/(\sigma_B^2 + \sigma_W^2)$ (the intraclass correlation). However, in some situations, it was anticipated that the correlation between measurements from the same person would decrease as the number of time intervals between observations increased. Such an autocorrelation structure would be appropriate when Δt is shorter than the residence time of the biomarker, as might be common for intermediate and long term biomarkers. To identify datasets containing significant autocorrelation, an exponential (EXP) covariance structure was also considered, based on the biomarker residence time, the intervals between measurements, and the average number of repeated observations per person (ni⩾3). For an EXP covariance structure, $\rho_{jj'} = \exp(-\varphi\,|t_j - t_{j'}|)$ for all $j \neq j'$, where $t_j$ and $t_{j'}$ are the times (the same for all subjects) at which the jth and j′th measurements were taken. When measurements are taken at the same equally spaced times for all subjects, then EXP simplifies to the first-order autoregressive AR(1) covariance structure, where $\rho_{jj'} = \rho^{|j-j'|}$ with $\rho = \exp(-\varphi\,\Delta t)$ for measurements spaced Δt apart. Akaike’s information criterion (AIC) and Schwarz’s Bayesian information criterion (BIC) were used to compare CS and EXP under Model (1) to choose an appropriate covariance structure; CS was chosen unless both AIC and BIC were smaller for EXP than for CS. Appendices C and D list all situations in which EXP was used, rather than CS, to estimate variance components (see the OEM website).
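The relationship between the EXP and AR(1) structures, and the selection rule just described, can be sketched as follows. The code builds the within-person error correlation matrix for each structure and checks that they coincide for equally spaced times; the measurement times, the value of φ, and the AIC/BIC figures are hypothetical, and in practice the information criteria would come from the fitted mixed models.

```python
import math


def exp_correlation(times, phi):
    """EXP structure: correlation decays as exp(-phi * |t_j - t_j'|)."""
    return [[math.exp(-phi * abs(tj - tk)) for tk in times] for tj in times]


def ar1_correlation(n, rho):
    """AR(1) structure: correlation rho ** |j - j'| between the jth and j'th errors."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]


def choose_structure(aic_cs, bic_cs, aic_exp, bic_exp):
    """Rule from the text: keep CS unless both AIC and BIC are smaller for EXP."""
    return "EXP" if (aic_exp < aic_cs and bic_exp < bic_cs) else "CS"


# Hypothetical weekly sampling (times in days) and decay parameter
times = [0, 7, 14, 21]
phi = 0.1
exp_matrix = exp_correlation(times, phi)
ar1_matrix = ar1_correlation(len(times), math.exp(-phi * 7))

same = all(
    abs(exp_matrix[j][k] - ar1_matrix[j][k]) < 1e-12
    for j in range(len(times))
    for k in range(len(times))
)
print("EXP equals AR(1) for equal spacing:", same)
print(choose_structure(aic_cs=100.0, bic_cs=110.0, aic_exp=98.0, bic_exp=112.0))
```

In the last line EXP has the smaller AIC but the larger BIC, so the rule retains CS.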

Fixed time effects

Given a database consisting of studies ranging in duration from days to years, it was not uncommon to observe situations where average exposure levels changed systematically over time. We considered three types of time effects via the fixed time-effect term in Model (1), namely, seasonal effects (studies of at least 6 months), weekday effects (studies of less than 6 months), and linear trends (all studies). Time effects were identified graphically using scatter plots of the raw data, and were then confirmed statistically via likelihood ratio tests comparing Model (1) with and without the fixed time-effect term. If no significant time effects were found, variance components were estimated after removing the fixed time-effect term from Model (1). To avoid overfitting the model, only a single time effect was used.

If not explicitly specified in Model (1), a missing fixed time effect would tend to exert its biasing influence by increasing the estimate of the within-person variance component $\sigma_W^2$ and reducing the estimate of the between-person variance component $\sigma_B^2$.19 To gauge the magnitude of such biases on estimation of variance components, whenever a significant time effect was observed, Model (1) was applied to the dataset with and without the fixed time-effect term and the estimates of $\sigma_B^2$ and $\sigma_W^2$ were compared.
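The direction of this bias can be illustrated with a small simulation. The sketch below assumes a balanced design in which every person is measured on the same occasions, represents the time effect as a shift shared by all persons on each occasion, and uses ANOVA (method of moments) estimators in place of REML; all names and parameter values are hypothetical.

```python
import random
import statistics


def variance_components(data, time_params=0):
    """Balanced ANOVA estimates (between-person, within-person).

    time_params is the number of fixed time parameters already removed from
    the data; it only reduces the within-person degrees of freedom.
    """
    k, n = len(data), len(data[0])
    means = [statistics.fmean(row) for row in data]
    grand = statistics.fmean(means)
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ssw = sum((y - m) ** 2 for row, m in zip(data, means) for y in row)
    msw = ssw / (k * (n - 1) - time_params)
    return max((msb - msw) / n, 0.0), msw


def remove_occasion_means(data):
    """Subtract each occasion's mean across persons (one fixed effect per occasion)."""
    n = len(data[0])
    col_means = [statistics.fmean([row[j] for row in data]) for j in range(n)]
    return [[y - col_means[j] for j, y in enumerate(row)] for row in data]


rng = random.Random(42)
shifts = [0.0, 1.0, 0.0, -1.0]  # hypothetical seasonal shifts on the log scale
data = []
for _ in range(50):
    person_effect = rng.gauss(0.0, 0.7)
    data.append([person_effect + s + rng.gauss(0.0, 0.5) for s in shifts])

vb_raw, vw_raw = variance_components(data)
vb_adj, vw_adj = variance_components(remove_occasion_means(data), time_params=len(shifts) - 1)
print(f"time effect ignored:  between = {vb_raw:.2f}, within = {vw_raw:.2f}")
print(f"time effect adjusted: between = {vb_adj:.2f}, within = {vw_adj:.2f}")
```

Ignoring the shared shifts leaves them in the within-person sum of squares, which inflates the within-person estimate and, because the between-person estimate is obtained by subtraction, deflates the between-person estimate.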

Bias in estimating exposure-disease relationships

We designate the ratio of $\sigma_W^2$ to $\sigma_B^2$ as the variance ratio $\lambda = \sigma_W^2/\sigma_B^2$. This variance ratio (λ) can be used to evaluate attenuation bias when estimating an exposure-disease relationship, given that either air or biomarker levels are used as surrogates for actual exposure levels.6,20,21 Consider the simple situation where the underlying relationship between the logarithm of the true mean exposure for the ith person (based on air or biomarker levels) and the logarithm of the expected value of a continuous health outcome is a straight line with slope θtrue (see Appendix E, true regression model, on the OEM website). Suppose a sample of persons is randomly selected from the population, each subject having n randomly collected measures of exposure. If the average of these n logged exposure measurements is used as a surrogate for the true logged mean exposure level for the ith person (see Appendix E, measurement error model), then the slope parameter actually being estimated (namely, θ*) is related to θtrue via the following equation:

$$\theta^{*} = \frac{\theta_{\text{true}}}{1 + \dfrac{\lambda}{n}\,(1 + \Delta)} \qquad (2)$$

where $\Delta = \frac{2}{n}\sum_{j<j'} \rho_{jj'}$. From Equation (2), we see that θ* is less than θtrue (that is, there is attenuation), with the magnitude of the attenuation given by the expression on the right side. We have previously considered a simpler case6,20 where $\rho_{jj'} = 0$ for all $j \neq j'$ (under CS), giving Δ = 0 and the well-known expression:22

$$\theta^{*} = \frac{\theta_{\text{true}}}{1 + \lambda/n} \qquad (3)$$

In Appendix E, we consider the two special cases, $\rho_{jj'} = 0$ and $\rho_{jj'} = \rho^{|j-j'|}$, which correspond to the CS and AR(1) covariance structures, respectively. From Equations (2) and (3), we see that (for a fixed n) attenuation increases as the variance ratio λ increases, which suggests (at least for the simple straight-line model on the log scale being considered in Appendix E) that the exposure surrogate with the smallest λ should produce (on average) the least underestimation of θtrue. With this motivation, we use the estimated variance ratio $\hat{\lambda} = \hat{\sigma}_W^2/\hat{\sigma}_B^2$ to compare air measurements with biomarkers for a given study (smaller is better), consistent with our earlier work.6 Here, we denote the estimated λs for air and biological monitoring as $\hat{\lambda}_{\text{air}}$ and $\hat{\lambda}_{\text{bio}}$, respectively. We also define $\hat{\lambda}_{\text{bio}}/\hat{\lambda}_{\text{air}}$ as the “lambda ratio”; when the lambda ratio is less than one, there is evidence that the biomarker would be a better surrogate for exposure than air measurements and vice versa.
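The dependence of attenuation on λ and n can be explored numerically. The sketch below assumes that the attenuation factor θ*/θtrue equals 1/(1 + (λ/n)(1 + Δ)) with Δ = (2/n) Σ ρjj′ summed over j < j′, which is what results from the variance of a mean of n correlated errors and which reduces to the familiar 1/(1 + λ/n) when the errors are uncorrelated. Function names and input values are hypothetical.

```python
def delta_from_correlations(rho):
    """Delta = (2/n) * sum of the above-diagonal error correlations rho[j][k]."""
    n = len(rho)
    return (2.0 / n) * sum(rho[j][k] for j in range(n) for k in range(j + 1, n))


def attenuation_factor(lam, n, delta=0.0):
    """Ratio theta*/theta_true for variance ratio lam and n measurements per person."""
    if lam < 0 or n < 1:
        raise ValueError("lam must be non-negative and n at least 1")
    return 1.0 / (1.0 + (lam / n) * (1.0 + delta))


# Uncorrelated errors: more measurements per person reduce attenuation
for n in (1, 2, 4, 8):
    print(n, round(attenuation_factor(lam=2.0, n=n), 3))

# AR(1) errors with a hypothetical rho = 0.5 and n = 3: averaging helps less
rho = 0.5
ar1 = [[rho ** abs(j - k) for k in range(3)] for j in range(3)]
print(round(attenuation_factor(lam=3.0, n=3, delta=delta_from_correlations(ar1)), 3))
```

A factor of 1 means no attenuation; the smaller the factor, the more the estimated slope understates θtrue. Positive autocorrelation makes repeated measurements less informative, so the same λ and n give a smaller factor than under independence.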

Statistical methods

In addition to statistical analyses involving Model (1) described above, analysis of variance (ANOVA) or non-parametric Wilcoxon rank-sum tests (if the distributions were skewed) were used to compare variance components between air measurements and biomarkers. We used PROC MIXED for longitudinal analyses with the SAS statistical package version 8.02 (SAS Institute Inc., Cary, NC). The level of significance of all tests was 0.05.


Description of the database

A total of 12 181 repeated observations from 132 data sets were compiled from 22 studies covering a wide range of pollutants (notably metals, organic compounds, and pesticides) in both environmental and occupational settings (Appendices A and B). The data are summarised in table 1, which lists the numbers of air measurements, biomarker measurements, and subjects, as well as the category of each biomarker according to its type (kinetic compartment in fig 1) and residence time. The numbers of biomarkers in our database decreased from P (21), to M (12), to RY (7), to D (3), to R (2). The database was also reasonably populated with biomarkers in all three categories of residence time—that is, short term (21), intermediate term (15), and long term biomarkers (9). For some contaminants, more than one biomarker was measured. After excluding 104 pre-shift observations, the data used for analysis (12 077 observations) included 50 air-exposure data sets (4623 observations) and 77 biomarker data sets (7454 observations).

Table 1

 Descriptive characteristics of the database*

Effects of time on estimation of variance components

Significant effects of time were found in approximately one third (18 of 50) of air monitoring data sets and in approximately half (36 of 77) of biomarker data sets (Appendices C and D). One such effect is illustrated in fig 2A, which shows a seasonal effect in levels of free styrene glycol in blood (μg/ml) observed among reinforced plastics workers during three surveys conducted 3–4 months apart (unpublished data from a study described by Rappaport and colleagues24). When Model (1) was fitted to the data without a fixed seasonal effect, the residuals deviated from the horizontal line representing zero (fig 2B). In contrast, when Model (1) was fitted to the data with a fixed seasonal effect, the residuals varied randomly about zero (fig 2C). The estimated within- and between-person variance components were potentially biased due to fitting Model (1) to the data without a seasonal effect; that is, the within-person estimate $\hat{\sigma}_W^2$ increased from 1.24 to 1.33 (+6.9%) and the between-person estimate $\hat{\sigma}_B^2$ decreased from 0.595 to 0.562 (−5.5%).

Figure 2

 Characteristics of an example biomarker data set (free styrene glycol) (unpublished data, Rappaport and colleagues24). (A) Time profile of free styrene glycol in blood from workers exposed to styrene and styrene oxide. (B) Residuals (without adjustment for seasonal effect) under Model (1). (C) Residuals (with adjustment for seasonal effect) under Model (1). *European format.

Table 2 summarises the contributions of time effects to $\hat{\sigma}_W^2$ and $\hat{\sigma}_B^2$ in all datasets. If an important time effect was wrongly excluded from Model (1), then the within-person estimate $\hat{\sigma}_W^2$ typically increased 18.2% (median value) for air measurements and 25.4% (median value) for biomarker measurements. Conversely, if an important time effect was excluded from Model (1), the between-person estimate $\hat{\sigma}_B^2$ typically decreased by 11.3% (median value) for air measurements and 4.1% (median value) for biomarkers.

Table 2

 Contribution of time effects to the estimated variance components

Alternative covariance structure

Of all the studies in our database, only two produced significantly better fits to Model (1) with an EXP (rather than a CS) covariance structure, namely, DDE and trans-nonachlor in blood,30 both long term biomarkers, and inorganic lead and δ-aminolevulinate in urine,26 both intermediate term biomarkers (see Appendices C and D). This suggests that CS is generally appropriate for applications of Model (1) to air and biomarker measurements.

Between- and within-person variance components

The cumulative distributions of the estimated between- and within-person variance components are shown in fig 3 in terms of the corresponding fold ranges (that is, ${}_B\hat{R}_{0.95}$ and ${}_W\hat{R}_{0.95}$, respectively) for air measurements and biomarkers. The difference between distributions of ${}_B\hat{R}_{0.95}$ for air measurements (median ${}_B\hat{R}_{0.95}$ = 7.4) and for biomarkers (median ${}_B\hat{R}_{0.95}$ = 7.7) was not significant (Wilcoxon rank sum test, p = 0.54) (see fig 3A). Within-person variation was much greater than between-person variation for both air measurements (median ${}_W\hat{R}_{0.95}$ = 48.9) and biomarkers (median ${}_W\hat{R}_{0.95}$ = 17.4) (fig 3B). Also, the values of ${}_W\hat{R}_{0.95}$ for biomarkers were significantly smaller than those for air measurements (Wilcoxon rank sum test, p<0.01). We attribute this to the smoothing of exposure variability in the human body, which increases with the residence time of the biomarker.27,28 Indeed, median values of ${}_W\hat{R}_{0.95}$ for biomarkers decreased in the order: short term (median = 44.6) > intermediate term (median = 3.7) > long term (median = 3.3).
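The smoothing effect invoked here can be illustrated with a toy kinetic model. The sketch below assumes a single compartment with first-order elimination, so that the biomarker level at each step is a weighted average of the previous level and the current exposure, with weight exp(−Δt/residence time) on the past; the exposure series is simulated as independent lognormal values. This is a hypothetical illustration of the mechanism, not a model fitted to any data set in the study.

```python
import math
import random
import statistics


def simulate_biomarker(exposures, residence_time, dt=1.0):
    """Biomarker series from a one-compartment, first-order model (same time units for both arguments)."""
    carry_over = math.exp(-dt / residence_time)  # fraction retained from one step to the next
    level = statistics.fmean(exposures)          # start near the long-run level
    series = []
    for x in exposures:
        level = carry_over * level + (1.0 - carry_over) * x
        series.append(level)
    return series


def log_variance(series):
    """Variance of the natural-log-transformed series (within-person variance for one person)."""
    return statistics.variance([math.log(v) for v in series])


rng = random.Random(7)
exposures = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(2000)]  # daily exposures, one person

results = {tau: log_variance(simulate_biomarker(exposures, tau)) for tau in (1, 10, 100)}
print(f"exposure itself: log-scale variance = {log_variance(exposures):.2f}")
for tau, v in results.items():
    print(f"residence time {tau:>3} days: log-scale variance = {v:.3f}")
```

The longer the residence time, the more past exposures contribute to the current level, and the smaller the within-person variance of the logged biomarker.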

Figure 3

 Cumulative distributions of fold ranges of variability for exposure and biomarker measurements. (A) Estimated fold range (${}_B\hat{R}_{0.95}$) containing 95% of average exposure (and biomarker) encountered by the population. (B) Estimated fold range (${}_W\hat{R}_{0.95}$) containing 95% of average exposure (and biomarker) encountered by a given person.

Environmental exposures varied much more within persons than occupational exposures for both air measurements (environmental: median ${}_W\hat{R}_{0.95}$ = 104; occupational: median ${}_W\hat{R}_{0.95}$ = 13.7) and biomarkers (environmental: median ${}_W\hat{R}_{0.95}$ = 36.6; occupational: median ${}_W\hat{R}_{0.95}$ = 7.6). For comparison, table 3 also shows the cumulative distributions of ${}_B\hat{R}_{0.95}$ and ${}_W\hat{R}_{0.95}$ estimated from occupational studies involving air measurements, reported by Kromhout and colleagues,17 and biomarkers, reported by Symanski and Greeson.18 Neither the ${}_B\hat{R}_{0.95}$ distribution nor the ${}_W\hat{R}_{0.95}$ distribution was found to differ significantly between our database and those earlier compilations (data not shown). Overall, the databases show that the median value of ${}_W\hat{R}_{0.95}$ was greater than that of ${}_B\hat{R}_{0.95}$ in a given setting for both air measurements and biomarkers.

Table 3

 Comparison of between- and within-person estimated fold ranges across studies and exposure settings

Bias in estimating exposure-disease relationships

Potential bias in the estimation of the slope (θtrue) of an assumed straight line log exposure-log disease relationship (see Appendix E) was evaluated by examining the estimated variance ratio, $\hat{\lambda}$ (smaller is better). In general, values of $\hat{\lambda}$ for biomarkers (median = 1.04) were significantly smaller than those for air measurements (median = 2.40) (Wilcoxon rank sum test, p = 0.02). From this result, we infer that using a biomarker as a surrogate exposure measure in a typical study would tend to provide a less biased estimate of the slope of a log exposure-log disease linear relationship than would the use of air measurements.
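As a rough illustration of what these medians imply, the well-known attenuation expression for uncorrelated within-person errors, θ*/θtrue = 1/(1 + λ/n), can be evaluated at the two median variance ratios. Treating a median across heterogeneous data sets as if it were the variance ratio of a single study is a simplifying assumption made only for this worked example.

```latex
% Attenuation factor theta^*/theta_true = 1 / (1 + lambda/n), uncorrelated errors assumed
\begin{align*}
n = 1:\quad & \text{air: } \frac{1}{1 + 2.40} \approx 0.29, &
              \text{biomarker: } \frac{1}{1 + 1.04} \approx 0.49 \\
n = 5:\quad & \text{air: } \frac{1}{1 + 2.40/5} \approx 0.68, &
              \text{biomarker: } \frac{1}{1 + 1.04/5} \approx 0.83
\end{align*}
```

Under these assumptions, a single measurement per person would recover about 29% of the true slope with a typical air measurement and about 49% with a typical biomarker; five measurements per person would raise these figures to about 68% and 83%.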

Figure 4 shows the median and interquartile ranges of $\hat{\lambda}$ for both air measurements and biomarkers, stratified by exposure setting (fig 4A) and type of agent (fig 4B). Biomarkers produced significantly smaller values of $\hat{\lambda}$ than air measurements (Wilcoxon rank sum test) for environmental exposures (p = 0.01) (fig 4A) and for metal exposures (p = 0.03) (fig 4B). However, for pesticide exposures, air measurements produced significantly smaller $\hat{\lambda}$ values than did biomarker measurements (p = 0.04) (fig 4B). Estimates of $\hat{\lambda}$ are shown in fig 4C for measurements stratified by the residence time of the biomarker. Here, a decreasing trend for $\hat{\lambda}$ was observed for biomarkers in the order: short term > intermediate term > long term, consistent with reductions in the within-person fold range ${}_W\hat{R}_{0.95}$ noted previously.

Figure 4

 Median values and interquartile ranges of $\hat{\lambda}$ values for: (A) exposure setting; (B) exposure agent; and (C) biomarker residence time. Error bars represent interquartile ranges. *p<0.05, Wilcoxon rank sum test for median values.

The above comparisons were based on all studies in our database, whether or not parallel measurements of air levels and biomarkers were included in each investigation. To make direct comparisons between air and biomarker measurements in a given study, the estimated lambda ratio, $\hat{\lambda}_{\text{bio}}/\hat{\lambda}_{\text{air}}$, was investigated. Of the 54 data sets that provided parallel measurements, almost two thirds (62%) had estimated lambda ratios of less than one (median lambda ratio = 0.46), again providing evidence that biomarkers tend to provide less biasing measures of exposure than air measurements. The median and interquartile ranges of estimated lambda ratios were also stratified and compared by exposure setting, type of agent, and biomarker residence time as shown in fig 5. Results here are generally consistent with those from fig 4, with estimated lambda ratios less than one for environmental settings (fig 5A) and for metals, but not for pesticides (fig 5B). However, the estimated lambda ratios increased in the order: intermediate term biomarkers < short term biomarkers < long term biomarkers (fig 5C), which was unanticipated based on the earlier comparisons of $\hat{\lambda}$ (see fig 4C). This could reflect the relatively small numbers of studies with parallel air and biomarker measurements and the fact that the few long term biomarkers represented (n = 6) included several non-specific endpoints, such as HPRT mutations and SCEs, that could have been influenced by smoking, ionising radiation, and other types of exposures.38
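The paired comparison described above amounts to dividing the biomarker variance ratio by the air variance ratio for each data set with parallel measurements and then summarising the resulting ratios. A minimal sketch follows; the function names are hypothetical and the input pairs are invented for illustration, not taken from the study.

```python
import statistics


def lambda_ratio(lambda_bio, lambda_air):
    """Estimated lambda ratio: biomarker variance ratio over air variance ratio."""
    if lambda_air <= 0:
        raise ValueError("the air variance ratio must be positive")
    return lambda_bio / lambda_air


def summarise_lambda_ratios(pairs):
    """Median lambda ratio and the share of data sets with a ratio below one."""
    ratios = [lambda_ratio(bio, air) for bio, air in pairs]
    share_below_one = sum(r < 1 for r in ratios) / len(ratios)
    return statistics.median(ratios), share_below_one


# Hypothetical (lambda_bio, lambda_air) pairs, NOT values from the study
pairs = [(0.5, 2.0), (1.2, 1.0), (0.9, 3.0), (2.0, 4.0)]
median_ratio, share = summarise_lambda_ratios(pairs)
print(f"median lambda ratio = {median_ratio:.2f}; share below one = {share:.0%}")
```

A ratio below one favours the biomarker for that data set; a ratio above one favours the air measurement.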

Figure 5

 Median estimated lambda ratios and interquartile ranges of estimated lambda ratios for: (A) exposure setting; (B) exposure agent; and (C) biomarker residence time. Error bars represent interquartile ranges, and the dashed line represents a lambda ratio of one. * p<0.05, Wilcoxon rank sum test for median values.


Our findings support the notion that biomarkers can offer a desirable alternative to air sampling for assessing exposures to chemicals. In addition to providing the oft-mentioned theoretical advantages (accounting for all exposure routes and interindividual differences and residing closer to the disease process), biomarkers also tend to have smaller estimated variance ratios (estimated λ), and, therefore, to be potentially less biasing surrogate measures of exposure than air measurements for studies of health effects. This particular advantage of biomarkers has only been mentioned anecdotally heretofore.6,20

If values of λ are to be considered in designing a health effects study, it is important that the within-person and between-person variance components be estimated with minimal bias. For both air and biomarker measurements, we found that time effects and the choice of covariance structure could be important to the characterisation of these variance components. Excluding an important fixed time effect had a greater impact on the within-person variance component than on the between-person variance component, as observed for other types of longitudinal data.39 This would tend to increase values of the estimated λ for the candidate exposure measures, making them appear worse than they actually are. Regarding the choice of covariance structure, we found that CS was appropriate for characterising the within-person and between-person variance components in virtually all cases. However, CS assumes that repeated measurements collected from a given person have the same correlation no matter how far apart they are in time. Thus, investigators should be aware of potential problems arising from the timing of biomarker measurements relative to the residence time, particularly for intermediate and long term biomarkers, or should use an EXP covariance structure.
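As a rough illustration of how a variance ratio might be obtained from repeated measurements, the sketch below applies the textbook one-way random-effects ANOVA (method of moments) estimators to log-transformed data. It is not the mixed-model procedure used in this study: it assumes a balanced design, no fixed time effects, and equally correlated repeats within a person (consistent with a CS structure). The function name `variance_components` and the input layout are hypothetical.

```python
def variance_components(logged):
    """Moment estimates of variance components for a balanced design.

    logged: list of per-person lists of log-transformed measurements;
            every person must have the same number of repeats.
    Returns (within_var, between_var, variance_ratio), where
    variance_ratio = within_var / between_var (the quantity called lambda).
    """
    k = len(logged)        # number of persons
    n = len(logged[0])     # repeated measurements per person
    if k < 2 or n < 2 or any(len(p) != n for p in logged):
        raise ValueError("need a balanced design with >=2 persons and >=2 repeats")

    person_means = [sum(p) / n for p in logged]
    grand_mean = sum(person_means) / k

    # Sums of squares within and between persons.
    ss_within = sum((y - m) ** 2
                    for p, m in zip(logged, person_means) for y in p)
    ss_between = n * sum((m - grand_mean) ** 2 for m in person_means)

    ms_within = ss_within / (k * (n - 1))
    ms_between = ss_between / (k - 1)

    within_var = ms_within
    # Truncate at zero: the moment estimator can be negative in small samples.
    between_var = max(0.0, (ms_between - ms_within) / n)
    ratio = within_var / between_var if between_var > 0 else float("inf")
    return within_var, between_var, ratio


# Toy (already logged) data: 3 persons, 2 repeats each.
print(variance_components([[1.0, 3.0], [4.0, 6.0], [7.0, 9.0]]))
```

An unmodelled time trend shared by all persons would be absorbed into the within-person sum of squares in this sketch, which is one way to see why omitting a fixed time effect inflates the estimated variance ratio.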

We found that values of the estimated between-person variance component tended to be larger for environmental exposures than for occupational exposures for both air and biomarker measurements (fig 2B). This indicates that members of the general public experience greater ranges of pollutant levels in their everyday lives than do workers in a given factory and job (as noted in Rappaport and Kupper21), and may explain why biomarkers had consistently smaller lambda ratios in environmental studies than in occupational studies (fig 5A). Thus, biomonitoring may be more advantageous in environmental settings than in occupational settings.

Among biomarkers, we noticed a decreasing trend for the estimated λ in the order: short term > intermediate term > long term due to the likely smoothing of exposure variability related to slow elimination of the biomarker (fig 4C). This suggests that biomarkers with longer residence times would be preferred to those with shorter residence times. That is, smaller numbers of biomarker measurements per subject would be needed to help control for attenuation bias in an exposure-response relationship (see Appendix E). For example, to estimate the slope of the linear relationship between logged inorganic lead exposure and some logged continuous health outcome, with a bias no larger than 0.10, we use the data of Cope and colleagues26 to estimate n. From equation (2), the number of measurements per subject required for a given bias b can be determined as the smallest positive integer n satisfying the inequality:

Embedded Image

where Embedded Image. The minimum sample size was estimated by substituting b = 0.10 into equation (4), along with the estimates of λ (16.6 for inorganic lead exposure, and 0.654 for urinary lead, Appendix D) and ρ (zero for air lead and 0.25 for urinary lead under AR(1)) for the corresponding true parameters. This leads to estimated sample sizes of n = 10 measurements per subject for lead in urine and n = 150 measurements per subject for lead in air. There should be little doubt in this case that the biomarker would provide a better surrogate measure of exposure than air measurements for investigating health effects in this population, as has previously been argued in the context of hazard control.40

The advantage noted above for intermediate and long term biomarkers (relative to air measurements) will not generally be realised for short term biomarkers, which reflect exposure during the current or preceding day. In the study by Rappaport and colleagues,26 for example, styrene in exhaled air (a short term biomarker) was measured along with styrene in air. Using data from that investigation, the estimated λ was 0.28 for air styrene and 0.99 for styrene in exhaled air (Appendix D), and Δ was 0 in both cases. To achieve the desired goal of b⩽0.1, from equation (4) we require n = 3 measurements per person for air styrene and n = 9 measurements of styrene in exhaled air. In this case, there is evidence that styrene in air would be a better surrogate measure of exposure than styrene in exhaled air.
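The sample sizes quoted for lead and styrene can be checked with a short calculation. The sketch below is not a transcription of equation (4), which is not reproduced in this text. It assumes that the relative attenuation bias equals r/(1 + r), where r is the variance of a person's mean of n logged measurements divided by the between-person variance, and that within-person errors follow an AR(1) process with lag-one correlation ρ (ρ = 0 giving independent repeats). The function names are hypothetical.

```python
def attenuation_bias(lam, n, rho=0.0):
    """Relative attenuation bias of the slope under the assumed model.

    lam: within- to between-person variance ratio (lambda)
    n:   number of repeated measurements averaged per person
    rho: assumed AR(1) lag-one correlation of within-person errors
    """
    # Var(mean of n AR(1) errors) = sigma_w^2 / n^2 * (n + 2 * sum_{k=1}^{n-1} (n - k) * rho^k)
    scale = n + 2.0 * sum((n - k) * rho ** k for k in range(1, n))
    r = lam * scale / n ** 2
    return r / (1.0 + r)


def min_measurements(lam, rho=0.0, max_bias=0.10, n_cap=10000):
    """Smallest n for which the assumed bias does not exceed max_bias."""
    for n in range(1, n_cap + 1):
        if attenuation_bias(lam, n, rho) <= max_bias:
            return n
    raise ValueError("no n up to n_cap meets the bias target")


# Variance ratios and correlations as quoted in the text (Appendix D).
print("lead in air:       ", min_measurements(16.6))
print("lead in urine:     ", min_measurements(0.654, rho=0.25))
print("styrene in air:    ", min_measurements(0.28))
print("styrene in breath: ", min_measurements(0.99))
```

Under these assumptions the calculation returns 150, 10, 3, and 9 measurements per subject, matching the values reported in the text. With ρ = 0 the condition reduces to n ⩾ λ(1 − b)/b, so for b = 0.10 roughly nine measurements are needed per unit of λ.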

Although the above calculations indicate that biomarkers with longer residence times would generally be preferred to those with shorter residence times, the specificity of candidate biomarkers and the precision of assays can also be important. Consider, for example, the eight biomarkers listed in Appendix D for styrene and styrene oxide. Values of the estimated λ for these biomarkers increased in the following order: blood styrene (0.770, two studies) < breath styrene (0.989) < urinary mandelic acid (1.44, two studies) < lymphocyte SCEs (1.58) < lymphocyte HPRT mutation frequency (1.77) < blood styrene glycol (2.09) < blood SO-albumin adduct (14.3) < SO-DNA adduct (24.7). The two smallest values of the estimated λ were observed for short term biomarkers (styrene in blood and breath) while the two largest values were for intermediate term biomarkers (albumin and DNA adducts of SO). This can partially be explained by the imprecision of the adduct assays; indeed, the coefficient of variation of the post-labelling assay for the DNA adduct was about 200%.23

Our analyses did not permit inferences to be made about the effects on variance ratios of important metabolising and repair genes. However, it is reasonable to expect that functional SNPs of these gene alleles would increase the between-person variance component of relevant biomarkers while having little effect on the within-person variance component. Since air levels should be independent of SNP status, the practical effect of functional SNPs would be to preferentially decrease variance ratios for biomarkers relative to air levels. This would also reduce the biasing effect of such biomarkers as surrogates for exposure.

Aside from the biasing potential of using air and biomarker measurements as surrogates for true exposure levels, other constraints could loom large, such as the difficulty of repeatedly collecting blood specimens rather than air samples from a population or the increased costs of biological measurements compared to air measurements. Also, our analyses implicitly assume that air represents the dominant route of exposure to the toxic chemical. This will not always be the case. For example, we found that air measurements of pesticides produced significantly smaller estimated λ values than did biomarker measurements (fig 4B), suggesting that ingestion and/or dermal contact were reflected by biomarkers in those studies. Taking all factors into account, the optimal measure of exposure for an epidemiology study depends not only on variance ratios of the air and biomarker measurements (smaller is better), but also on projected sample sizes (larger is better), based on practical considerations and costs, and on knowledge of the dominant route of exposure (if multiple routes, biomarkers are better).

Finally, it is worth mentioning that studies that collect both air measurements and biomarkers are particularly valuable because they provide information with which to estimate the rates of human uptake, elimination, and metabolism of toxic chemicals. Given the paucity of human toxicokinetic data for most contaminants, the quantification of such rates with primary data from observational studies would be valuable. When collected in longitudinal sampling designs, where repeated exposure and air measurements are obtained from representative persons, these data can also allow interindividual differences in uptake, etc to be estimated and ultimately related to genetic, physiological, and lifestyle factors (for example, see Rappaport and colleagues41).

Limitations of the study

Our analyses were limited in several important ways. First, we were constrained by the relatively few studies that provided longitudinal data of both air measurements and biomarkers from a given population and by the limited numbers of measurements per subject in most investigations. Small sample sizes particularly limited our ability to draw clear conclusions in stratified comparisons (for example, for biomarkers of pesticides and long term biomarkers). Second, since most of our database was derived from secondary data, it was only possible to examine the effects of a relatively small number of covariates, such as occupational or environmental sources of exposure, etc. Third, we focused entirely on exposures to airborne contaminants, recognising that other routes (dermal contact or ingestion) could have produced significant contributions to biomarker levels in some cases. Fourth, we considered biasing measurement error effects only in the context of individual based studies where air measurements or biomarkers were obtained from each person in a sample and the logged continuous health outcome was related to the logged individual mean measure of exposure. The statistical issues in such an individual based study are somewhat different from those in a group based study, where the mean health outcome for each group is compared with the corresponding group mean of the exposure measure.42 And finally, we recognise that our database was confined largely to published investigations of biological monitoring. These studies could well have been biased in favour of biomarkers that had previously been shown to be useful, such as metals in blood and urine. If this were the case, then we somewhat overstate the generally less biasing advantage of biomarkers that we observed.

Main messages

  • There is considerable within- and between-person variability in both air and biomarker levels that affects valid and precise characterisation of exposure-disease relationships.

  • Although air measurements and biomarkers each have distinct advantages and disadvantages as surrogate measures of exposure, biomarkers appear to provide less biased estimates of true exposure levels for epidemiological studies.

Policy implications

  • Epidemiologists should consider using biomarkers instead of, or in addition to, air measurements for assessing levels of chemical exposure.


We identified consistently great variability in air levels and biomarkers both between and within persons in a large number of longitudinal studies of chemical exposure. We argue that the air or biological measure with the smallest (within- to between-person) variance ratio should be the optimal, that is, the least biasing, surrogate for exposure in a study of health effects. We present evidence that biomarkers tend to have smaller variance ratios than air measurements. Epidemiologists should consider the magnitudes of variance ratios of air measurements and biomarkers as one criterion for selecting the optimal surrogate for exposure in their studies.


The authors appreciate the assistance of Dr Ai-Ko Liu and Mr Sungkyoon Kim for technical support. All authors declare that they have no competing interests regarding publication of material contained in this paper.


  • The appendices are available as a downloadable PDF (printer friendly file).

  • Funding: this work was supported by contract MTH0311 from the American Chemistry Council and by Center Grant P30ES10126 from the National Institute of Environmental Health Sciences

  • Competing interests: none
