Asbestos fibreyears and lung cancer: a two phase case–control study with expert exposure assessment
H Pohlabeln (1), P Wild (2), W Schill (1), W Ahrens (1), I Jahn (1), U Bolm-Audorff (3), K-H Jöckel (4)

  1. Bremen Institute for Prevention Research and Social Medicine (BIPS), Bremen, Germany
  2. French National Research and Safety Institute (INRS), Department of Epidemiology, Nancy, France
  3. Labour Inspection, Occupational Health Division, Wiesbaden, Germany
  4. Institute for Medical Informatics, Biometry, and Epidemiology, University of Essen, Essen, Germany

Correspondence to: Dr H Pohlabeln, Bremen Institute for Prevention Research and Social Medicine, Linzer Str. 8, D-28359 Bremen, Germany


Aims: To assess the cumulative effect of asbestos on lung cancer risk where the exposure is assessed by an expert rating.

Methods: 1678 male cases and controls were enrolled in a population based matched case–control study, focused on occupational risk factors, carried out in West Germany. The exposure to asbestos was computed as lifelong working hours. For a validation subsample of 164 matched pairs from this study the intensity of asbestos exposure was further assessed by a panel of experts in order to obtain an estimate of the cumulative exposure on a time by intensity scale (fibreyears). The information on duration of asbestos exposure in the original study was combined with the fibreyears following the two phase case control study paradigm.

Results: The number of exposed subjects in the validation subsample was 75 cases and 71 controls. The percentage of subjects with a cumulative exposure ≤1, 1 to ≤10, and >10 fibreyears was 16%, 15%, and 15% for the cases and 18%, 16%, and 9% respectively for the controls. The smoking adjusted odds ratios for the fibreyears based on an unconditional logistic regression were 0.81, 1.02, and 1.60 respectively with increasing exposure categories (not significant). The exponentiated coefficient, exp(beta), for a log transformed trend was 1.156. Applying the two phase paradigm, these odds ratios became 0.86, 1.33, and 1.94; the last of these reached significance and exp(beta) was 1.178.

Conclusions: The two phase paradigm allowed us to obtain a more precise estimate of the effect of asbestos on lung cancer. Results are consistent with a doubling of the lung cancer risk with 25 fibreyears asbestos exposure.

  • lung cancer
  • asbestos
  • two phase studies

A recently published case–control study1 evaluated carcinogens and occupations suspected of causing lung cancer and aimed to generate new hypotheses about occupational risks. A strong point of this study was its thorough control of occupational asbestos exposure, which could be quantified in lifelong hours of exposure. This point was further elaborated on a validation subsample from this case–control study by obtaining an expert quantitative exposure assessment in fibres/ml for every job phase, which allowed a computation of the cumulative exposure to asbestos in so called fibreyears. The main objective was to combine the information about asbestos exposure in the original study with the estimated cumulative exposure (fibreyears) of the validation subsample in order to obtain a more precise estimate of the effect of asbestos on lung cancer than an analysis of the validation sample alone would give. We report the exposure assessment in the initial and the validation sample and present three analyses of these data. The first is an analysis by duration of exposure, which could be obtained for all subjects. The second is a logistic regression of the validation sample alone, focusing on the effect of fibreyears. The third treats the two data sets as a two phase study stratified by disease status and duration of exposure. This allows a two phase logistic regression analysis, again aimed at the effect of fibreyears, which makes use of the information from both samples and thus gives more precise estimates.



The original case–control study included 839 male cases and 839 male controls individually matched on age and region from all hospitals in Bremen between 1988 and 1993 and hospitals in Frankfurt/Main (n = 86) between 1989 and March 1990. Cases were eligible if: the diagnosis of lung cancer was histologically or cytologically confirmed; the diagnosis occurred less than three months before interview; subjects were born in 1913 or later and of German nationality; subjects were well enough to undergo an interview of 1.5 hours' duration; and there was no suspicion of pulmonary metastases from a different primary tumour. The response rate was 69% among cases and 68% among controls (randomly drawn from the mandatory residence registries of a priori selected reference communities). All subjects were interviewed by trained interviewers. A structured questionnaire was used in face to face interviews to obtain information on job history, active and passive smoking, residence, dietary habits, medical history, and basic demographic characteristics (see Jöckel and colleagues1 for details).

Exposure assessment

The assessment of occupational exposure was based on three sources: a detailed job history of all jobs held for at least six months; an exposure checklist for known and suspected carcinogens (among them asbestos); and 33 supplementary questionnaires (SQs). Job titles, industries, and departments were coded according to the standard classifications provided by the Statistisches Bundesamt.2,3 The SQs were used in addition to the customary job history whenever job titles (for example, painter, farmer), tasks (for example, insulation), industries (for example, chemical industry), or circumstances (for example, use of asbestos in the company) implied exposure to substances which are potentially carcinogenic. The well known carcinogens of the lung were included (asbestos, arsenic, nickel, etc) as well as substance groups that were suspected to cause lung cancer (welding fumes, cutting fluids, wood dust, etc). For example, one of our a priori hypotheses in the original study was that exposure to welding fumes and gases is associated with an increased lung cancer risk. Welding was therefore addressed in one of the 33 supplementary questionnaires (SQ 14: “Welding, Flame Cutting”). Application of the SQs was “job overlapping”; that is, SQ 14 for instance was applied not only to welders but also to study subjects who welded but worked in other occupations (plumbers, mechanics, electricians, etc).

Questions with regard to asbestos exposure were addressed in 19 of these supplementary questionnaires. Participants answered questions for every job phase on duration of exposure in years, days per year, and hours per day. It was therefore possible to calculate for the whole lifespan of each subject the product of (years)×(days/year)×(hours/day) of asbestos exposure. This index is called “lifelong hours of exposure”. The performance of this method with respect to asbestos has been evaluated previously.4, 5
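The index described above can be illustrated with a minimal sketch. It assumes that the per-phase products are summed over all exposed job phases, which the text implies but does not state explicitly; the `JobPhase` class, the function name, and the example work history are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class JobPhase:
    years: float          # duration of the exposed job phase in years
    days_per_year: float  # exposed days per year during that phase
    hours_per_day: float  # exposed hours per day during that phase


def lifelong_hours(phases):
    """Sum (years x days/year x hours/day) over all exposed job phases.

    Assumption: phases are additive; an empty history gives 0 hours.
    """
    return sum(p.years * p.days_per_year * p.hours_per_day for p in phases)


# Hypothetical work history, for illustration only.
history = [JobPhase(10, 200, 4), JobPhase(5, 100, 2)]
print(lifelong_hours(history))
```

In this hypothetical history the first phase contributes 8000 hours and the second 1000 hours.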

For a validation subsample of 328 subjects consisting of all male cases recruited in the Bremen hospitals (in 1991 and 1992) and their controls, the intensity of occupational exposure to asbestos was further assessed by a panel of industrial hygiene experts. The rating was based on the complete questionnaire information (that is, job history, exposure checklist, and SQs), but the latter were considered most informative by the panel. The panel comprised two industrial hygienists from the Institute and Outpatient Clinic for Occupational and Social Medicine, University Gießen. For each phase of the subjects' work histories, an exposure level in fibres/ml (f/ml) was estimated, which, when multiplied by the duration of these periods, yielded an estimated cumulative exposure in fibreyears. A fibreyear is defined as working in a full shift (eight hours) for one year at an average dust level of 1 fibre/ml (f/ml), two years at 0.5 f/ml, or any other combination yielding the product of 1. These assessments were based on the raters' own experience in measurement and on the rules that have been established for the judgement of compensation claims of asbestos related lung cancer.6 The rating was done blindly: the experts did not know whether they were rating the questionnaire of a case or a control.
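The fibreyear definition can be written out as a short sketch. It assumes that every job phase is a full shift exposure, so that cumulative exposure is simply the sum of intensity times duration; the function name and the input layout are hypothetical, and the experts' actual rating rules are not reproduced here.

```python
def fibreyears(phases):
    """Cumulative exposure in fibreyears.

    phases: iterable of (intensity_f_per_ml, duration_years) pairs.
    Assumption: each phase is a full (eight hour) shift exposure.
    """
    return sum(intensity * years for intensity, years in phases)


# The two combinations named in the definition each give one fibreyear.
print(fibreyears([(1.0, 1.0)]))
print(fibreyears([(0.5, 2.0)]))
```

Part-shift exposure would need an additional scaling rule, which is deliberately left out because the text does not specify one.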

Both raters are very experienced with assessment of the cumulative dose of asbestos exposure, gained from assignment of a comparable asbestos rating in a recently published case–control study of occupational risk factors for diffuse malignant mesothelioma.7 In that study, the agreement between the two experts with regard to the crude exposure status (ever exposed/never exposed) was good: kappa = 0.72 (95% CI: 0.62 to 0.82).

Statistical methods

As the present study focused on asbestos risk, we concentrated our analyses on asbestos exposure variables and smoking adjustment. The exposure index “lifelong hours”, based on the asbestos related SQs, was treated as in the original study1 in four exposure categories: non-exposed to asbestos, and three duration categories representing the tertiles of the distribution among exposed subjects. In the validation subsample, the cumulative exposure to asbestos was also subdivided into four categories: non-exposed to asbestos according to the experts, and three exposure groups of 0 to ≤1, 1 to ≤10, and >10 fibreyears. Alternatively continuous fibreyears and log(fibreyears + 1) were considered as exposure variables. The smoking behaviour was also expressed in four categories: non-smokers, including occasional smokers; and three smoker groups defined by the number of packyears, of which the first of these (≤20 packyears) included all smokers of cigars and pipes. The other two groups are cigarette smokers with a cutpoint at 40 packyears.
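The two fibreyear codings described above might be sketched as follows. The sketch assumes that "non-exposed according to the experts" corresponds to zero fibreyears; the function names and category labels are hypothetical.

```python
import math


def fibreyear_category(fy):
    """Map cumulative exposure to the four categories used in the analysis.

    Assumption: subjects rated as non-exposed have exactly 0 fibreyears.
    """
    if fy <= 0:
        return "non-exposed"
    if fy <= 1:
        return ">0 to <=1"
    if fy <= 10:
        return ">1 to <=10"
    return ">10"


def log_fibreyears(fy):
    """Continuous exposure variable log(fibreyears + 1), natural logarithm."""
    return math.log(fy + 1.0)


for fy in (0, 0.5, 5, 25):
    print(fy, fibreyear_category(fy), round(log_fibreyears(fy), 3))
```

Adding 1 before taking the logarithm keeps non-exposed subjects at a value of 0 rather than leaving them undefined.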

Two phase sampling is a standard technique that involves stratified sampling after the first phase.8 The investigator first draws a random sample to measure the covariates needed for stratification. At phase two, random subsamples of varying size are drawn from within each stratum, and the collection of data is completed for the subjects selected in this second phase. Applied to case–control studies, the first phase sample is a large case–control study considered as being cross classified by disease status and a categorical exposure classification, thus defining the strata. From within each stratum (disease × exposure), members are selected for complete or more precise covariate ascertainment. Specific analytical methods of fitting unconditional logistic regression models appropriately utilise data from both phases of sampling.

In the present study, we considered our initial data set as being divided into 16 strata resulting from the cross classification of disease status, the categorised duration of asbestos exposure, and a binary smoking classification (heavy smokers (more than 20 packyears) versus light smokers and non-smokers). We assumed further that all subjects of our validation sample were independently sampled from these 16 strata. As these strata include the disease status, the matching structure is of course lost. The two phase logistic regression combines information from the original case–control study, as contained in the 16 strata frequencies, with the information on fibreyears in the validation sample. This leads to increased precision of the fibreyear parameters over an analysis using the validation sample alone.
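The stratum bookkeeping behind this design can be sketched as below. This covers only the cross classification into 16 strata and the share of first phase subjects that reach the second phase; it does not implement the maximum likelihood fit used in the study. All labels, function names, and subjects are hypothetical.

```python
from collections import Counter
from itertools import product

# Illustrative labels; the study uses tertiles of duration among exposed subjects.
DISEASE = ("case", "control")
DURATION = ("non-exposed", "tertile 1", "tertile 2", "tertile 3")
SMOKING = ("heavy (>20 packyears)", "light or non-smoker")

# 2 x 4 x 2 = 16 strata
STRATA = list(product(DISEASE, DURATION, SMOKING))


def stratum_counts(subjects):
    """Count subjects per stratum; each subject is a (disease, duration, smoking) tuple."""
    counts = Counter(subjects)
    return {s: counts.get(s, 0) for s in STRATA}


def sampling_fractions(phase1, phase2):
    """Share of phase one subjects in each stratum that also appear in phase two."""
    n1 = stratum_counts(phase1)
    n2 = stratum_counts(phase2)
    return {s: (n2[s] / n1[s] if n1[s] else None) for s in STRATA}


# Hypothetical subjects, not study data.
a = ("case", "non-exposed", "heavy (>20 packyears)")
b = ("control", "tertile 3", "light or non-smoker")
phase1 = [a] * 4 + [b] * 2
phase2 = [a] * 1 + [b] * 1
fractions = sampling_fractions(phase1, phase2)
print(len(STRATA), fractions[a], fractions[b])
```

An empty first phase stratum yields `None` here, which mirrors the practical problem that a stratum without subjects cannot contribute to a two phase analysis.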

As mentioned in the introduction, we analysed the data in three ways. Firstly, the original data set was reanalysed, fitting smoking (in four categories) and duration of exposure (lifelong hours of asbestos exposure in four categories). However, results for duration of exposure are not of prime interest. Therefore, the second analysis is a logistic regression of the validation sample alone, focusing on the effect of fibreyears (in four categories as well as a continuous variable). The third analysis considered the validation sample as a second phase sample of a two phase case–control study.

It has to be stressed that the two phase paradigm cannot make use of a close matching structure. Therefore the first two analyses were carried out with both conditional and unconditional regression: (a) to show that the parameter estimates differ only to a small extent; and (b) to obtain a basis for a comparison of the unconditional results with those from the two phase logistic regression.

Odds ratios and 95% confidence intervals were calculated using the SAS procedures LOGISTIC for the unconditional logistic regression model and PHREG for the conditional analysis, respectively.9 We fitted the two phase logistic regression by maximum likelihood using the EM algorithm.10 The latter was programmed using the SAS procedure IML by one of us (WS).11


Table 1 shows the results of the conditional and the unconditional logistic regression analysis for the original case–control study with categorical covariates duration of asbestos exposure and smoking, included in a single model. The results of the two regressions only differ to a small extent: the smoking parameters were lower in the conditional regression and the asbestos duration parameters somewhat higher. When adjusting for age (continuous as well as categorical) and region in the unconditional regression, the risk estimates for smoking and asbestos remained virtually unchanged (results not shown). Therefore the matching variables could be disregarded in this study. Despite slightly different smoking parameterisation, these results are very close to those already published.1

Table 1

Estimated regression coefficients (log OR), standard errors (SE), odds ratios (OR), and 95% confidence intervals (CI) for smoking behaviour and duration of occupational asbestos exposure in the entire study group of 839 male lung cancer cases and 839 male population controls (as results of conditional and unconditional logistic regression models including all parameters)

As expected, duration of exposure, as assessed in the original case–control study, is closely related to the expert estimation of cumulative exposure (see fig 1). This was especially the case for the non-exposed group. Of the 183 subjects included in the validation sample for which no asbestos exposure was found in the initial assessment (solely based on the SQs), only 10 subjects were assigned some asbestos exposure by the experts based on all questionnaire information. Conversely, nine of 182 subjects had no fibreyears but some asbestos duration. Considering the expert assessment as the gold standard, this leads to a sensitivity of 93% and a specificity of 95%.
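The sensitivity and specificity quoted above can be recomputed from the reported counts. In this sketch the number of subjects classed as exposed by both methods is not reported directly; it is derived by subtraction from the validation sample size of 328. Variable names are hypothetical.

```python
# Counts reported for the validation sample.
total = 328
sq_nonexposed = 183       # no asbestos exposure according to the SQs
expert_nonexposed = 182   # no fibreyears according to the experts
false_negatives = 10      # SQ non-exposed, but exposed according to the experts
false_positives = 9       # some SQ duration, but no fibreyears

true_negatives = sq_nonexposed - false_negatives  # 173
true_positives = total - true_negatives - false_negatives - false_positives  # 136

# Expert assessment is treated as the gold standard.
sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)
print(f"sensitivity {sensitivity:.1%}, specificity {specificity:.1%}")
```

The derived counts are internally consistent: 173 plus 9 gives the 182 subjects without fibreyears.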

Figure 1

Duration of asbestos exposure calculated as lifelong working hours (based on SQ) cross classified by cumulative asbestos exposure in fibreyears (based on the expert rating) in the validation sample of 164 male lung cancer cases and 164 male population controls.

Table 2 presents the conditional and unconditional regression analyses of the validation sample and the cross classification of case–control status with smoking and fibreyears. Note that five of 18 non-smoking cases (28%) were included compared to 26 of 138 non-smoking controls (19%). This results in lowered odds ratios for smoking, as can be seen by comparing the unconditional analyses in tables 1 and 2. With regard to asbestos exposure, the only odds ratio (OR) substantially larger than one is that for the highest exposure category (>10 fibreyears), which does not reach statistical significance. If a quantitative model is fitted using fibreyears transformed by the natural logarithm, a positive trend is detected, but it is not statistically significant. Again the results presented here are close to those obtained by conditional logistic regression, taking the matching into account. As in the total sample, the parameter estimates were virtually unchanged when adjusting for age. As before, the conditional analysis shows slightly larger exposure effects, and the trend with log transformed fibreyears is of borderline statistical significance. When considering interaction with smoking, the trend in log transformed fibreyears among smokers of more than 40 packyears is larger than that among non-smokers and light smokers, although not significantly so.

Table 2

Estimated regression coefficients (log OR), standard errors (SE), odds ratios (OR), and 95% confidence intervals (CI) for smoking behaviour and occupational asbestos exposure in the validation sample of 164 male lung cancer cases and 164 male population controls (as results of conditional and unconditional logistic regression models); the first model presents the simultaneous fit of categorised smoking and fibreyears, the second categorised smoking (not shown) and continuous ln(fibreyears+1)

Table 3 shows the results of the two phase analysis. The ORs concerning smoking are very close to the preceding (unconditional) analysis and the ORs with regard to fibreyears are only slightly different. However, the standard errors of these estimates are much smaller as the analysis borrows much information from the original case–control study. In our study this has the consequence that the OR for the highest exposure category is significantly greater than 1.

Table 3

Estimated regression coefficients (log OR), standard errors (SE), odds ratios (OR), and 95% confidence intervals (CI) for smoking behaviour and occupational asbestos exposure by applying the two phase paradigm (estimation by the two phase (unconditional) logistic regression); the first model presents the simultaneous fit of categorised smoking and fibreyears, the second categorised smoking (not shown) and continuous ln(fibreyears+1)
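The relation between the tabulated quantities can be sketched as follows, assuming Wald type intervals (coefficient plus or minus 1.96 standard errors on the log scale); the text does not state the interval type, so this is an assumption. The function name and the input values are hypothetical and do not come from the tables.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) from a log odds ratio and its standard error.

    Assumption: a Wald interval, exp(beta - z*se) to exp(beta + z*se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient with two different standard errors.
wide = odds_ratio_ci(0.5, 0.40)
narrow = odds_ratio_ci(0.5, 0.20)
print([round(v, 2) for v in wide])
print([round(v, 2) for v in narrow])
```

With the same coefficient, the smaller standard error moves the lower limit above 1, which is the mechanism by which a gain in precision alone can turn a non-significant odds ratio into a significant one.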

The parameter for trend is also increased (compared to the unconditional analysis of table 2); it is not only significant but its confidence interval is narrower than in the preceding analysis, thus again illustrating the gain in efficiency obtained from the two phase analysis. The trend for untransformed fibreyears was also fitted. However, when comparing the model predictions of untransformed fibreyears (OR = 1.00, 1.03, and 1.18 respectively for 0.5, 5, and 25 fibreyears) and logtransformed fibreyears (OR = 1.07, 1.34, and 1.71 respectively for 0.5, 5, and 25 fibreyears) with the categorised ORs from tables 2 or 3, one can see that the logtransformed fibreyears give a much better fit.
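The model predictions quoted above can be reproduced with a short sketch. It assumes that the value 1.178 given in the abstract for the two phase trend is the odds ratio per unit of ln(fibreyears + 1). The slope for untransformed fibreyears is not reported, so it is back-calculated here from the quoted OR of 1.18 at 25 fibreyears; that value is an assumption, not a published coefficient. Function and variable names are hypothetical.

```python
import math

# Assumed: 1.178 is exp(beta) for the log transformed trend (two phase analysis).
BETA_LOG = math.log(1.178)

# Assumed: slope back-calculated from the quoted OR of 1.18 at 25 fibreyears.
BETA_LINEAR = math.log(1.18) / 25


def or_log_model(fy):
    """OR under the log model: exp(beta * ln(fy + 1))."""
    return math.exp(BETA_LOG * math.log(fy + 1.0))


def or_linear_model(fy):
    """OR under the untransformed model: exp(beta * fy)."""
    return math.exp(BETA_LINEAR * fy)


for fy in (0.5, 5, 25):
    print(fy, round(or_linear_model(fy), 2), round(or_log_model(fy), 2))
```

Under these assumptions the log model returns 1.07, 1.34, and 1.71 and the untransformed model 1.00, 1.03, and 1.18, matching the figures in the text. The untransformed model stays close to 1 over the low and middle exposure range, which is why it fits the categorised ORs less well.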


Our stated aim in this study was to assess the quantitative effect of a cumulative asbestos exposure, where the exposure is assessed by experts. We did this in two ways: firstly by simply analysing the part of the case–control study in which this fibreyear information was collected (the validation sample); and secondly, by analysing this validation sample together with the original case–control study in the framework of the two phase paradigm.

One application of this quantitative estimation is to assess the hypothesis of a twofold risk with a cumulative exposure of 25 fibreyears, which is the basis of the German compensation scheme. When using the parameterisation in log (fibreyears+1) as computed from the two phase analysis, the corresponding OR is 1.71 (95% CI: 1.18 to 2.46). When the risk was computed directly with a fifth category indicator for the subjects exposed more than 25 fibreyears, the corresponding OR was estimated at 1.73 (95% CI: 0.85 to 3.53). This latter OR was lower than the OR for 10+ fibreyears (table 3) as the category between 10 and 25 fibreyears contained seven cases and only two controls. However, both estimates of the risk are consistent with the stated hypothesis.

This result cannot of course be interpreted as the effect of the actual cumulative asbestos dose inhaled, as the expert assessment is necessarily imprecise in the absence of any measurement. However, the expert rating tries to mimic the assessment which would possibly be used in compensation claims. Furthermore, as some jobs, which are traditionally associated with exposure to asbestos, may have received closer attention than others, there may be a tendency to underestimate exposures in individuals with a relatively low exposure and, conversely, to overestimate in those with job titles inherently implying exposure. However, this exposure misclassification should not be differential with respect to the case–control status, and should therefore lead to attenuation of the dose–response effect.

The study was not initially designed as a two phase study. This has no consequence on the inference, given that the so called missing at random assumption holds. Here this means that the statistical distribution of fibreyears in subjects selected into the validation sample is the same as for subjects in the same stratum (defined by duration of exposure, smoking, and disease status) not selected in the validation sample. As the validation sample consists of all cases and corresponding controls enrolled during two years, this assumption is reasonable.

The choice of the stratification of the original case–control study, which is necessary for a two phase analysis, had to be made at the analysis stage. We chose this stratification in order to use as much information as possible concerning the parameter of interest (cumulative asbestos exposure), therefore including the duration of this exposure in four categories while stratifying on the main confounder (smoking) in two categories. A more appropriate stratification would have taken all the four smoking categories. Unfortunately no non-smoking case in the highest asbestos duration category was available in the case–control study, preventing us from applying the two phase method using the more appropriate stratification.

The two phase analysis does not allow for the matching structure. However, given the fact that each case selected for the expert exposure assessment was included with his control, this sample remains balanced with respect to age and region, which were the two matching variables used. The fact that age has no influence was confirmed in a two phase analysis, with age as an added independent factor. This is because, in the original case–control study, age is uncorrelated with duration of asbestos exposure and packyears and, in the validation sample, age is uncorrelated with fibreyears and with packyears. The region might nevertheless be a slight confounder as the two phase analysis draws information from the correlation between duration and fibreyears. This correlation might be slightly different in the Frankfurt area, from which no subject was included in the validation sample, than in the Bremen area. However, the results excluding cases and controls from the Frankfurt area were only marginally different (results not shown). Furthermore tables 1 and 2, which present both the conditional and the unconditional logistic models, show that there are only small differences in the estimates by both methods.

Main message

  • The two phase paradigm allows a more precise estimate of the effect of asbestos on lung cancer compared to an analysis of the validation sample alone.

Comparison of tables 2 and 3 shows that increased efficiency can be gained from a two phase design. However, the sampling strategy used to obtain the validation sample was not optimal: an optimal design would oversample the sparser strata, for instance by including all non-smoking cases.12 Further research is ongoing on how to plan efficient two phase designs.

We know of no other published lung cancer case–control study which modelled the effect of fibreyears, to which our estimates could be compared. In contrast, there are several cohort studies which provide sufficient details (measurements) to allow a quantitative evaluation of the lung cancer risk owing to cumulative asbestos exposure. These studies, as well as reviews from these studies, are discussed in Boffetta,13 who summarised these findings and suggested a linear relation between cumulative asbestos exposure and lung cancer risk; he concluded that the most widely accepted relationship is “an increase of 1% of the risk of lung cancer for each fb/ml-yr of exposure”. However, some cohort studies also found risks which are in very good agreement with our results; for example, a study of South Carolina textile workers,14, 15 which suggested a doubling dose of approximately 30 fibreyears, also using a linear effect model on the multiplicative scale. However, the fit of such a model to our data was poor. The presented log fibreyear model fits the data better.

With regard to an expert based assessment of cumulative asbestos exposure in a case–control study, the only published study concerned a mesothelioma case–control study16 in which a similar, but steeper, dose–response relation was observed.


We would like to acknowledge Wolfgang Römer and Rolf Arhelger from the Institute and Outpatient Clinic for Occupational and Social Medicine, Justus-Liebig-University, Gießen for carrying out the expert rating. The study was supported financially by the Federal Ministry of Research and Technology (BMFT), grant no. 01HK546, and the Federal Ministry of Labour and Social Affairs (BMHS), grant no. III67-27/13. Most of Pascal Wild's work on this paper was carried out during his research leave at the BIPS.

Policy implication

  • The results of this study are consistent with a doubling of the lung cancer risk with a 25 fibreyears asbestos exposure.