Objectives Iron-ore miners are exposed to extremely dusty and physically arduous work environments. The demanding activities of mining select healthier workers with longer work histories (ie, the Healthy Worker Survivor Effect (HWSE)), and could have a reversing effect on the exposure–response association. The objective of this study was to evaluate an iron-ore mining cohort to determine whether the effect of respirable dust was confounded by the presence of an HWSE.
Methods When an HWSE exists, standard modelling methods, such as Cox regression analysis, produce biased results. We compared results from g-estimation of accelerated failure-time modelling adjusted for HWSE with corresponding unadjusted Cox regression modelling results.
Results For all-cause mortality when adjusting for the HWSE, cumulative exposure from respirable dust was associated with a 6% decrease of life expectancy if exposed ≥15 years, compared with never being exposed. Respirable dust continued to be associated with mortality after censoring outcomes known to be associated with dust when adjusting for the HWSE. In contrast, results based on Cox regression analysis did not support that an association was present.
Conclusions The adjustment for the HWSE made a difference when estimating the risk of mortality from respirable dust. The results of this study, therefore, support the recommendation that standard methods of analysis should be complemented with structural modelling analysis techniques, such as g-estimation of accelerated failure-time modelling, to adjust for the HWSE.
Statistics from Altmetric.com
If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.
What this paper adds
The heathy worker survivor effect is a common problem in occupational studies that evaluate the effects of exposures, but valid adjustment methods are rarely included in the data analysis.
In occupational studies, access to information on possible confounders that could also be time varying (ie, information correlated with the exposure, termination of employment, and outcome) is necessary for successful adjustment for the healthy worker survivor effect using g-estimation.
Exposure to respirable dust was associated with increased risk of mortality after adjustment for the healthy worker survivor effect, even when outcomes known to be related to dust were censored.
Iron ore mining has traditionally been performed in industrial working environments heavily contaminated with airborne dust and volatile compounds. Studies of a cohort of Swedish iron-ore workers from the Kiruna and Malmberget mines have found the following associations between dust and risk: exposure to quartz from mining is associated with elevated risks of silicosis mortality1 and lung cancer incidence;2 ,3 an association between respirable dust and myocardial infarction mortality;4 ex-iron miners experience persistent effects of chronic bronchitis and chronic productive cough.5 For this cohort, dust exposure has been associated only with the outcomes identified in the above studies.
The phenomenon of less healthy workers terminating employment earlier or transferring to another, possibly less demanding position in the company, can affect the results of the analyses of cumulative exposure effects in occupational cohort studies. This relative early termination of employment could be (or not be) due to the exposure in focus and result in a weakening of the association between a cumulative exposure and outcome. This phenomenon is known as the Healthy Worker Survivor Effect (HWSE).6 HWSE is an issue in iron-ore miner cohorts, because workers with poorer health have greater difficulty coping with the heavy physical strain associated with mining. For example, a previous study of the Kiruna and Malmberget cohort found a negative association between time working underground and several cause-specific outcomes.7
Survival analysis of occupational cohort study data is typically performed using standard rate/hazard models (eg, Cox and Poisson regression models).8 However, standard survival modelling risks producing biased estimates when models include time-varying variables. A variable is a time-varying confounder if past value of the confounder predicts current exposure and the current value of the confounder predicts outcome.9 In an occupational cohort affected by the HWSE, cumulative time employed is a time-varying confounder because cumulative employment time attained before exposure predicts current cumulative exposure and current cumulative employment time also predicts outcome. In addition, if past exposure predicts the current value of the time-varying confounder, then the results of a standard survival analysis will be biased.
Several methods have been suggested for control of the HWSE in occupational studies: restricting the analysis to include only participants remaining alive after a minimum length of time since hire; adjusting for current employment status; accounting for time lags in exposure; and adjusting for time since termination of employment. However, these methods have limitations that lead to other biases or are only effective under special circumstances.6 ,10 G-estimation of accelerated failure-time (AFT) models is one method that can control for time-varying confounding, and that can adjust for the HWSE without the limitations inherent in the other methods.8 ,11
We evaluated the risk of exposure from respirable dust on all mortality and cancer incidence among iron-ore miners. We compared standard survival Cox modelling with structural nested modelling by g-estimation of AFT models to assess if adjusting for the HWSE affected the conclusions. Of particular interest was whether the respirable dust variable explained the elevated risks of mortality and cancer incidence results, after censoring outcomes that were already known to be associated with dust for this cohort.
Miners were included in the study cohort if they were employed for at least 1 year between 1923 and 1998 at the Kiruna, and between1923 and 1996 at the Malmberget mine. The original cohort consisted of 13 623 male workers and has previously been described in detail.2 ,3 Several studies of workers from this cohort have been published between 2008 and 2013.1–4 ,7 ,12
The cohort was followed up for mortality from 1952 to 2006, and for cancer incidence from 1958 to 2007. The data were obtained from the Swedish National Cause of Death Register and the Swedish Cancer Register. Several aggregated groups of mortality outcomes from the cohort were examined: all-cause mortality; all-cause mortality except lung cancer mortality, respiratory disease mortality and violent deaths, with and without myocardial infarction mortality. All cancer and all cancer except lung cancer were the two groups of outcomes from the cohort representing cancer incidence. We used International Classification of Diseases (ICD)-codes to define mortality and cancer incidence outcomes: lung cancer mortality (ICD7: 162); respiratory disease mortality (ICD10: J00–J99); violent deaths (ICD10: V00–Y99); myocardial infarction mortality (ICD10: I20–I24); lung cancer incidence (ICD7: 162). Dates of events were linked with the cohort using the unique Swedish personal identification number.
We excluded participants who either died or emigrated (N=584) before 1952, who lacked information on work location (N=130), or who were employed only as white-collar workers (N=493). Classification of work periods and work locations (outside or inside) was based on each employee's job title and workplace. These data were collected from each company's manual records and computerised registers. Time of emigration from Sweden after 1952 was censored in the statistical analysis (N=54). The final cohort comprised 13 000 employees (8135 from the Kiruna and 5036 from the Malmberget mine) of which 171 had been employed at both mines.
Beginning in 1968, ambient air concentrations of respirable dust and quartz were measured at each mine. This study was based on respirable dust, which is the fraction of dust (particles) that is small enough (aerodynamic diameter ≤5 μm) to pass through the nose into the upper respiratory system, and deep into the bronchioles and alveoli. There were a total of 1981 samples with a mean ambient dust concentration that varied between 0.1 and 24.7 mg/m3. Based on these measurements and on each worker's job title or workplace, the exposures were categorically classified into levels 0, 0.8, 3, 8, 11, 14 and 24 by mine safety engineers and an occupational hygienist. From 1974 onwards, mechanical ventilation in the mines and improved ventilation in the trucks reduced dust levels. The dust levels present before 1968 were similar to the corresponding levels between 1968 and 1973. Detailed descriptions of the dust exposure measurements and the measurement procedures were published in two previous studies.2 ,4 An estimate of a personal cumulative exposure of respirable dust was achieved by multiplying dust in mg/m3 by the number of person-years for each worker (mg/m3×years).
We used Cox regression models with penalised splines to estimate smoothed HRs of cumulative respirable dust and outcome, and the corresponding 95% CIs. The amount of smoothness of the splines was based on three degrees or four degrees of freedom for respiratory disease mortality and for other outcomes, respectively. The models were adjusted for attained age by using age as the time scale for the model. Calendar year, cumulative employment time, year of first employment based on the categories ≤1929, 1930–1939, 1940–1949, 1950–1954, 1955–1959, 1960–1969 and ≥1970, white-collar worker (applies to workers who were employed both as blue-collar worker and white-collar worker), mine location (Kiruna or Malmberget) and employee leave for longer periods (yes or no; registered as employed in the company but not on duty, most likely on sick-leave) were the other covariates included in the model. Each spline curve was calibrated so that the HR was equal to unity at a cumulative exposure value equal to zero. All variables except year of first employment were analysed as time-dependent variables. The Cox regression models were also calculated based on categorised exposure-values (0, >0–24, 25–99, 100–199 and ≥200 mg/m3×years). The first year of employment was excluded from the analysis because inclusion into the cohort required a minimum employment time of 1 year.
Standard methods that adjust for confounding (either based on a model or using stratification) result in biased estimates if the confounding variable is affected by prior exposure. G-estimation of an AFT model is a semiparametric method for parameters estimation in structural nested models. This method was developed to adjust for the HWSE.11 Cumulative employment time represents a time-varying confounder because it is associated with cumulative dust and with possible unknown health-related factors. G-estimation has previously been described in detail.8 ,11 ,13 The method mimics a randomised trial protocol at each follow-up period where the ‘assignment of exposure’ conditional on previous exposure and confounding history is independent of outcome. In practice, a randomly assigned protocol of exposure is not possible in an observational cohort study. G-estimation achieves a pseudorandom relationship between exposure assignment and outcome by adjusting for confounding by modelling the relationship between the exposure and confounding variables. Three studies have used this method to control for the HWSE in occupational cohorts.10 ,14 ,15
The g-estimation calculation estimate the parameter ψ in an AFT model that links each individual's observed survival time with counterfactual survival time:where is the counterfactual survival time if unexposed, ψ is the parameter of interest and A(t) is an indicator of exposure (yes or no) at time t. The counterfactual survival time represents the survival time under the hypothetical scenario of the employee never being exposed. An iterative series of pooled logistic regression models are performed to estimate the true ψ. Using the notation developed by Hernán et al,8 the following pooled logistic model equation was used to estimate ψ: where A was current exposure (yes or no) at time period k, θ0(k) was a cubic spline for time of follow-up, was the exposure at the previous time period, ¯ was a vector of baseline and time-varying covariates, T>u(k) required that the worker had to be alive at period k, Δ(ψ*) was an indicator variable that was equal to one if the event would have been observed under all possible exposure scenarios and equal to zero otherwise (if an event was censored then it was artificially censored). The estimate of the true ψ was the value of ψ for which θ3=0. This estimated ψ was found by repeating the pooled logistic model until ψ for which value of θ3=0 was found. In this setting, dichotomisation of exposure and aggregation of data into discrete periods were compromises necessary to perform the g-estimation. We based the calculation on yearly periods and each period was classified as exposed if the period included an exposure ≥8 mg/m3; otherwise, the period was classified as unexposed. The choice of the cut-off at 8 mg/m3 dust was based on the a priori assumption that the step between 3 and 8 mg/m3 dust represented the most relevant cut-off for respirable dust when accounting for the categories used to define the exposure variable. We handled administrative censoring and censoring due to death from other causes (or emigration) using the method described by Hernán and Witteman.8 ,13
The estimated parameter ψ from g-estimation was interpreted as follows: depending on whether ψ was positive or negative, being always exposed (in this study, exposed to respirable dust ≥8 mg/m3) increased or decreased the expected survival time by a factor of exp(−ψ). Therefore, exp(−ψ) was an estimate of the survival ratio, which was the ratio between the life expectancy if all of the workers in the cohort had been exposed to dust ≥8 mg/m3 continuously through the follow-up period and the life expectancy if none of the workers were ever exposed (represented by <8 mg/m3). The survival ratios were then transformed into conditional survival ratios that represented a more realistic scenario from an occupational cohort in which the cohort population was exposed for a given number of years (in this study, for 15 years). We transformed the survival ratios into conditional HRs to compare the levels of the parameters estimated with g-estimation with the HRs estimated from standard Cox regression models. This transformation was accomplished by performing a standard Cox regression on a data set in which each worker contributed data from two hypothetical scenarios: exposed and unexposed based on the counterfactual survival times. All g-estimation results were based on a scenario that all workers were exposed for only 15 years compared with not exposed (labelled conditional survival ratios or conditional HRs). Conditional survival and HR estimation procedures are presented in an e-appendix prepared by Chevrier et al10
Adjustment for possible selection bias due to loss of follow-up or competing risk was performed by applying inverse-probability weights that were calculated using pooled logistic regression analysis. The model estimating ψ (as described previously), and the pooled logistic model estimating the weights for handling competing risks contained the following information: dust exposure previous year; indicators for working underground previous year; working outside previous year; calendar year; year of first employment; age at baseline; time-varying age; cumulative dust exposure; employment time outside; employment time underground; previously on sick-leave; cumulative time off work (likely to be employed elsewhere or unemployed); and mine location. Follow-up on exposure to respirable dust ended on the last day of employment as a blue-collar worker or at the end of 1973, whichever occurred earlier. After 1973, exposure to respirable dust was considerably lower (in practice, only time periods employed as a blue-collar worker or no longer than up to 1973 were evaluated when estimating ψ).
Ninety-five percent CIs (95% CI) for ψ, conditional survival ratios and conditional HRs were calculated using bootstrap simulation by finding the 2.5th and 97.5th centiles of the bootstrap estimates obtained from the bootstrap replications. For each bootstrap data set, the bootstrap replications were found by performing a grid search for an estimate of ψ. We used the estimating equation described by Hernán, (ie, finding the value of ψ where the weighted residual sum of the difference between the observed and expected exposures converges to zero)8 in the bootstrap algorithm. The bootstrap CIs were based on 600 non-parametric bootstrap replications. The R-code used for performing g-estimation in this study is available as an online supplementary appendix.
Data were analysed using the statistical software package R (V.2.15.2, R Development Core Team, R Foundation for Statistical Computing, Vienna, Austria).
By the end of follow-up (2006), 13 000 workers accounted for 443 930 person-years when the end point was mortality (cancer-incidence: 428 035 person-years). The mean age at first employment was 24 years. Mean employment time for the cohort was 17 years. Results for the distributions of outcomes and their relationships to exposure to dust are presented in table 1. For all workers, 74% were exposed to dust and 37% were exposed to high levels of dust (≥8 mg/m3). The distribution of exposure by year of employment is illustrated in figure 1. The values for the distribution (%) of person-years (as employed) per category of respirable dust exposure in mg/m3 were 0: 48%, 0.8: 16%, 3: 19%, 8: 6%, 11: 9%, 14: 2% and 24: <1%. Additional characteristics of the cohort were published in a previous article.7
Exposure to cumulative dust and Cox regression modelling
The mortality from all causes increased as the levels of respirable dust increased (figure 2). The HR was 1.22 (95% CI 1.09 to 1.37) for iron-ore workers exposed to ≥200 mg/m3×years of cumulative dust, compared with unexposed workers. The association weakened when lung cancer, violent deaths and respiratory mortalities were censored (see online supplementary figure S1). When deaths due to myocardial infarctions were censored, removing this mortality further erased any association between exposure from dust and death (see online supplementary figure S2). Restricting the outcomes to mortality from respiratory diseases resulted in high levels of HR estimates, but they were not statistically significant (see online supplementary figure S3). However, the increase in the HR vanished when the 23 deaths from silicosis were censored (see online supplementary figure S3).
For all cause cancer incidence, workers exposed to high levels of respirable dust were at higher risk of mortality from cancer compared with unexposed workers (figure 3). The HR between workers exposed to ≥200 mg/m3×years of cumulative dust and those unexposed was 1.35 (95% CI 1.11 to 1.64). This association of higher rates with high levels of dust clearly weakened when lung cancer was censored from the outcomes (see online supplementary figure S4).
Exposure to dust and g-estimation of AFT modelling
For all-cause mortality, estimates of risk from respirable dust based on g-estimation resulted in interpretations that were similar to the interpretations of the Cox regression analysis results. A conditional survival ratio of 0.94 (95% CI 0.91 to 0.96; 140 outcomes were artificially censored) for all mortality indicated that the life expectancy decreased by 6% if a worker was exposed to dust ≥8 mg/m3 for 15 years compared with never exposed. This conditional survival ratio corresponded to a conditional HR of 1.29 (95% CI 1.14 to 1.39; figure 2). When lung cancer, violent deaths and respiratory diseases were censored in the analysis, the corresponding estimated conditional survival ratio was 0.95 (95% CI 0.92 to 0.98; 88 outcomes were artificially censored). This result suggested that a dust exposure effect was present. This result corresponded to a conditional HR of 1.23 (95% CI 1.07 to 1.36; see online supplementary figure S1). Censoring myocardial infarctions also weakened the association, but the result continued to indicate that dust could be associated with mortality (see online supplementary figure S2). This conditional survival ratio was 0.97 (95% CI 0.93 to 1.00; 57 outcomes were artificially censored) and the corresponding conditional HR was 1.16 (95% CI 1.00 to 1.34). Support of an association between dust and respiratory disease mortality was weak when g-estimation was used (see online supplementary figure S3). This estimated conditional survival ratio was then 0.97 (95% CI 0.89 to 1.05; 7 outcomes were artificially censored) for a worker exposed to dust for 15 years compared with a never-exposed worker. The corresponding HR was 1.23 (95% CI 0.64 to 2.35).
The g-estimation results indicated that there was only weak support for the finding that respirable dust increased cancer risk (figure 3). The conditional survival ratio was 0.96 (95% CI 0.92 to 1.01; 40 outcomes were artificially censored) and the corresponding conditional HR was 1.17 (95% CI 0.93 to 1.41). The conditional survival ratio for cancer incidence when lung cancer outcomes were censored was 0.99 (95% CI 0.93 to 1.02; 36 outcomes were artificially censored). The corresponding conditional HR was 1.06 (95% CI 0.89 to 1.40; see online supplementary figure S4).
This population-based cohort study revealed that exposure to respirable dust was associated with higher rates of all-cause mortality. This conclusion was supported by the Cox regression and AFT models based on g-estimation (figure 2). This result was expected for all-cause mortality because previous studies based on workers from this cohort also found exposure effects. Quartz dust from mining has been associated with lung cancer as an independent risk factor for lung cancer and has been correlated with radon exposure.2 ,3 Increased mortality from silicosis (a subset of respiratory diseases mortality) due to quartz exposure has also been found in a similar cohort.1 Dust-exposed work could be indirectly associated with elevated rates of violent deaths because previously the working environment was dusty and hazardous. Increased risk of mortality from injuries has been reported from this cohort.12
When lung cancer, respiratory diseases and violent deaths were censored in the analysis, the results based on Cox regression and g-estimation of structural AFT models suggested different conclusions (see online supplementary figure S1). The parameters estimated using g-estimation supported the conclusion that dust was associated with higher risks of mortality, but this conclusion was only weakly supported by the Cox regression model results. The hypothesis that dust exposure causes mortality as the result of myocardial infarctions has been tested in previous studies. One of these studies of the Kiruna/Malmberget cohort found that excess rates of myocardial infarctions are associated with dust exposure.4 In contrast with the Cox regression results, the results from the g-estimation analysis indicated that there was an association between mortality and dust exposure when the 1550 deaths from myocardial infarctions were censored (see online supplementary figure S2). These results suggested that an effect from respirable dust was present after adjustment for the HWSE using g-estimation of AFT models, even after subgroups of mortalities previously known to be related to dust were censored.
The Cox regression results indicated that cumulative respirable dust exposure was associated with higher respiratory disease mortality rates. However, this association was solely due to mortality from silicosis (see online supplementary figure S3). The estimates derived from g-estimation did not suggest that there was an association between dust and respiratory disease mortality. The hypothesis that dust from mining causes mortality as the result of respiratory diseases (other than silicosis) should not be excluded, because the validity of respiratory disease mortality diagnoses recorded in the Swedish Causes of Death Register is questionable.16
For cancer incidence, the analysis based on Cox regression revealed that elevated levels of cancer occurred among highly exposed workers (figure 3). In the study cohort, all cancers consisted of 12% lung cancer outcomes, which is a type of cancer associated with exposure to quartz.2 ,3 Compared with the Cox regression results, the results from the g-estimation of AFT model did not support this stronger association to the same extent. Perhaps the results based on g-estimation did not indicate the presence of an elevated risk because the HWSE did not bias the results as much as it did for the cancer analysis. It is also possible that g-estimation of AFT modelling was an inferior method in this situation because it was less precise and had a lower ability to analyse the quantitative aspects of the exposure. When lung cancer incidence was censored, none of the methods supported the hypothesis that respirable dust was a risk factor for cancer (see online supplementary figure S4).
Our occupational study used a rare analytical approach because it combines standard survival and structural nested modelling. The latter was used to adjust for HWSE with the ultimate aim to estimate a causal effect from exposure. We did not attempt to structure the two approaches to achieve comparable estimates. Our intention was that they should complement each other and use their respective advantages. Even if a HWSE was not present in this cohort, equivalent or even directly comparable results were not guaranteed because of differences between the models. A key difference between the model estimates was that g-estimation of an AFT model generates marginal estimates and standard Cox-modelling generates conditional estimates. In the context of this study, g-estimation estimated the relative change in life expectancy if the whole cohort had been exposed to respirable dust (in this study, 15 years) compared with the unexposed group. The HR obtained from a Cox regression model represented the risk ratio (exposed vs unexposed) conditional on other covariates. The corresponding HR from g-estimation, which was conditional on being limited to a 15 years exposure, was a marginal estimate that should be interpreted as the hazard rate if all workers were exposed for 15 years, relative to the hazard rate if all workers were unexposed. Hernàn has discussed some limitations that affect the estimate of the conventional HR derived from Cox regression that results from its conditional property.17
The critical assumption in g-estimation is that no unmeasured confounding is present in the model. To adjust for the HWSE, measurements related to the exposure and outcome must be available for each worker and for their work-periods. One strength of this study was the amount and type of information that was available for the mine workers (eg, profession, work place, blue-collar or white-collar work and working underground or outdoors) and their work periods (eg, temporary worker, not employed by one of the mines, and sick leave status). This information was available as time-dependent variables based on complete date intervals. It is unrealistic to assume that all information that could possibly cause bias from time-varying confounding was available for this study. A more plausible conclusion should be that the model was adjusted for the HWSE given the information on possible confounding that was actually available and included in the model. The validity of the assumption of no unmeasured confounding would have improved if information on smoking had been available.
The dichotomisation of exposure and the restriction of the cohort to include only work periods up to 1973 meant that we did not take full advantage of the quantitative data when we performed the AFT modelling based on g-estimation. Compared with AFT modelling based on g-estimation, Cox regression modelling provides more ways to handle the exposure information in the model, without restricting the data. The reference groups in the HRs from the Cox models were defined as unexposed to respirable dust; in the g-estimation of AFT models the reference groups were defined as <8 mg/m3 respirable dust. This predefined choice of a cut-off at 8 mg/m3 respirable dust as well as misclassification of exposure could have contributed to a result that indicated that there was no association between dust and cancer incidence. Given that the categories of respirable dust were recommended by professionals with knowledge of the mines, the cut of at 8 mg/m3 seemed to be a natural one, because 8 mg/m3 is high enough to be considered abundant in the environment (mean level of exposure from respirable dust for work-periods below 8 mg/m3 was 0.7 mg/m3; above 8 mg/m3, it was 10.5 mg/m3). Recent papers have presented techniques to handle exposure during g-estimation that do not completely restrict the analysis to dichotomised exposure variables.14 ,15 ,18 We believed that incorporation of these techniques in this study setting (with the accompanying increased analytical complexity) was beyond the scope of this study. We chose to present all estimates derived from g-estimation based on exposure duration restricted to a predefined given number of years (15 years). Of all workers, 43% had been employed for a minimum of 15 years as a blue-collar worker. The CI for the conditional survival ratio as a test of statistical significance (ie, if the CI included unity or not) is not affected by this restriction to 15 years of exposure. Another drawback of g-estimation results from the artificial censoring of participants. That is, depending on the evaluated estimated ψ, some outcomes may not be used. One consequence of artificial censoring is that the estimating function, which may be neither smooth nor monotone, makes valid hypothesis testing and CIs calculations more difficult.18
When evaluating the risk of exposure from respirable dust, using g-estimation of AFT models to adjust for the HWSE affected the interpretations of the results. In contrast with Cox regression modelling, the adjustment for the HWSE using g-estimation of AFT models suggested the presence of an association between cumulative dust and mortality after censoring mortality outcomes already known to be associated with dust. These results on mortality, therefore, provides further support for the efforts that has been made continuously to reduce dust exposure from mining as well as encouraging the reduction of harmful dust exposures in similar environments. However, adjusting for the HWSE did not have the same effect when cancer incidence outcomes were used. This study suggest that more comprehensive conclusions on the possible causal association between exposure and outcome can be made by complementing standard survival modelling with structural nested modelling, such as g-estimation of AFT models.
Contributors OB has been involved in all phases of the study, including study design, literature search, conduct of the study, data analysis and final article write up. All other researchers, listed as authors, have contributed fully to the planning and writing of the study.
Funding This work was supported by the European Union Kolarctic ENPI CBC grand number 02/2011/043/KO303- MineHealth.
Competing interests None.
Ethics approval The Regional Ethical Review Board in Umeå.
Provenance and peer review Not commissioned; externally peer reviewed.