Smoothing in occupational cohort studies: an illustration based on penalised splines
- 1 Occupational Health Program, Harvard School of Public Health, Boston; Department of Work Environment, School of Health and Environment, University of Massachusetts Lowell, Lowell, MA, USA
- 2 Department of Work Environment, University of Massachusetts Lowell, Lowell, MA, USA
- 3 Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
- 4 Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- 5 Department of Occupational and Environmental Health Sciences, University of Washington School of Public Health, Seattle, WA, USA
- Correspondence to: Prof. E A Eisen, Occupational Health Program, Harvard School of Public Health, 665 Huntington Avenue, Boston, MA 02115, USA; eeisen@hsph.harvard.edu
- Accepted 28 April 2004
Abstract
Aims: To illustrate the contribution of smoothing methods to modelling exposure-response data, Cox models with penalised splines were used to reanalyse lung cancer risk in a cohort of workers exposed to silica in California’s diatomaceous earth industry. To encourage application of this approach, computer code is provided.
Methods: Using plots of hazard ratios as smooth functions of exposure, the sensitivity of the curve to the amount of smoothing, the length of the exposure lag, and the influence of the highest exposures was evaluated. Trimming and data transformations were used to down-weight influential observations.
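The data preparation steps named in the Methods (lagging cumulative exposure, trimming the highest exposures, and transforming exposure) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code provided with the paper: the function names, the yearly exposure layout, and the `offset` added before taking logs are all hypothetical choices made for the example.

```python
import math
from typing import List, Sequence


def lagged_cumulative_exposure(annual_exposure: Sequence[float],
                               lag_years: int) -> List[float]:
    """Cumulative exposure at each year, ignoring the most recent `lag_years`.

    Assumes one exposure value per calendar year of follow-up (hypothetical layout).
    """
    if lag_years < 0:
        raise ValueError("lag_years must be non-negative")
    cumulative = []
    for year in range(len(annual_exposure)):
        cutoff = year - lag_years  # last year whose exposure is allowed to count
        total = sum(annual_exposure[: cutoff + 1]) if cutoff >= 0 else 0.0
        cumulative.append(float(total))
    return cumulative


def trim_highest(values: Sequence[float], n_drop: int) -> List[float]:
    """Drop the `n_drop` largest values, preserving the order of the rest."""
    if n_drop <= 0:
        return list(values)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    dropped = set(order[:n_drop])
    return [v for i, v in enumerate(values) if i not in dropped]


def log_transform(values: Sequence[float], offset: float = 1.0) -> List[float]:
    """Log-transform exposure; `offset` is an assumed device for handling zeros."""
    return [math.log(v + offset) for v in values]


if __name__ == "__main__":
    yearly = [1.0, 2.0, 3.0, 4.0]  # made-up annual exposures, arbitrary units
    print(lagged_cumulative_exposure(yearly, lag_years=0))
    print(lagged_cumulative_exposure(yearly, lag_years=2))
    print(trim_highest([5.0, 1.0, 9.0, 3.0], n_drop=2))
```

Rerunning the same model with different values of `lag_years`, or with and without `trim_highest`, is one way to organise the kind of sensitivity analysis described above; the fitting step itself (a Cox model with a penalised spline term) is left to a survival analysis package.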
Results: The estimated hazard ratio increased steeply with cumulative silica exposure before flattening and then declining over the sparser regions of exposure. The curve was sensitive to changes in degrees of freedom, but insensitive to the number or location of knots. As the length of lag increased, so did the maximum hazard ratio, but the shape was similar. Deleting the two most highly exposed subjects eliminated the top half of the exposure range and allowed the hazard ratio to continue to rise. The shape of the splines suggested that a parametric model with log hazard as a linear function of log-transformed exposure would fit well.
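The parametric form suggested by the splines can be written out explicitly. The notation below is an assumption introduced for illustration ($h_0$ for the baseline hazard, $x$ for cumulative exposure, $x_0$ for a reference exposure, $\beta$ for the coefficient); the abstract itself does not fix symbols, and the expression presumes $x > 0$ or a small offset added before taking logs.

```latex
\log h(t \mid x) = \log h_0(t) + \beta \log x
\quad\Longrightarrow\quad
\mathrm{HR}(x ; x_0) = \frac{h(t \mid x)}{h(t \mid x_0)} = \left(\frac{x}{x_0}\right)^{\beta}
```

Under this form the hazard ratio is a power function of exposure, so for $0 < \beta < 1$ it rises steeply at low exposure and flattens at higher exposure.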
Conclusions: This flexible statistical approach reduces the dependence on a priori assumptions, while pointing to a suitable parametric model if one exists. In the absence of an appropriate parametric form, however, splines can provide exposure-response information useful for aetiological research and public health intervention.
Abbreviations
- P-splines, penalised splines
- HR, hazard ratio
- RR, relative risk
- GAM, generalised additive model
- df, degrees of freedom
- AIC, Akaike’s Information Criterion
Footnotes
- Supported by Grant CA81345-03 from the National Cancer Institute









