Generalised estimating equations and low back pain
  1. E F Harkness1,
  2. E S Nahit1,
  3. G J Macfarlane1,
  4. A J Silman1,
  5. J McBeth1,
  6. G Dunn2
  1. 1Arthritis Research Campaign Epidemiology Unit, Stopford Building, Medical School, University of Manchester, Oxford Road, Manchester M13 9PT, UK; moeyjefh{at}
  2. 2Biostatistics Group, Medical School, University of Manchester
    1. W E Hoogendoorn3,
    2. P M Bongers3,
    3. H C W de Vet3,
    4. J W R Twisk3,
    5. W van Mechelen3,
    6. L M Bouter3
    1. 3Netherlands Cancer Institute, Plesmanlaan 121 Amsterdam, Netherlands; l.hoogendoorn{at}

      We read with interest the article by Hoogendoorn et al, who compared different approaches to analysing data from their prospective cohort study of work related exposures and the future onset of low back pain.1

      Exposures and outcomes are time dependent factors as they are subject to change over time. The strength of the relation depends on the assumptions of time dependence (or independence) of exposures and outcomes. The effects of these assumptions can be investigated by adopting different modelling approaches to studies that have collected repeated measures of exposure and outcome data over time.

      Hoogendoorn et al have adopted such an approach in their study of work related risk factors for low back pain.1 Information on work related physical and psychosocial factors and low back pain outcome was collected at baseline and in three annual follow ups. They showed an increased risk of low back pain for work related mechanical factors when using two different generalised estimating equation (GEE) models compared with the standard logistic regression approach.1 Conversely, for work related psychosocial factors the association with low back pain was weaker when the GEE method was employed. Such an approach is enlightening and we agree that it is important to explore such analytical techniques in the investigation of work related risk factors and musculoskeletal symptoms. Further exploration of this method of analysis therefore seems appropriate.

      We have recently conducted a prospective study of new onset low back pain in 1081 newly employed workers from 12 occupational settings.2 We examined newly employed workers since studies conducted in well established workforces may be influenced by the healthy worker effect, whereby workers may have changed their job or certain aspects of their job as a result of musculoskeletal pain. In brief, at baseline subjects completed a questionnaire, including an assessment of pain status. A preshaded manikin was used to enquire about low back pain, defined as pain between the 12th rib and the gluteal folds, lasting at least 24 hours in the past month. Individuals free from low back pain at baseline were identified and followed up at 12 and 24 months. The detailed questionnaire also gathered information on a number of work related mechanical and psychosocial exposures.

      The models used for analysis were identical to those used by Hoogendoorn et al.1 The standard logistic regression model was used to examine the relation between exposures and new onset low back pain at 12 or 24 months. GEE models are used to analyse repeated measures data, by taking the within subject correlation into account, and providing a summary estimate over time. In GEE model 1, the relation between baseline exposures and new onset low back pain at 12 or 24 months was examined. In GEE model 2 that relation was examined for baseline exposures and new onset low back pain at 12 months, and 12 month exposures and new onset low back pain at 24 months.
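
The difference between GEE models 1 and 2 lies mainly in how exposure and outcome records are paired. A minimal Python sketch of the two layouts follows. The record fields, function names, and toy values are hypothetical and are not taken from either study, and the model fitting step itself is omitted.

```python
# Hypothetical wide-format records, one dict per subject.
# "exposure" and "pain" are keyed by assessment month (0 = baseline).
subjects = [
    {"id": 1, "exposure": {0: 1, 12: 0}, "pain": {12: 0, 24: 1}},
    {"id": 2, "exposure": {0: 0, 12: 0}, "pain": {12: 0, 24: 0}},
]


def layout_model_1(subjects):
    """Pair the baseline exposure with the outcome at 12 and at 24 months."""
    rows = []
    for s in subjects:
        for month in (12, 24):
            # (subject id, outcome month, exposure, outcome)
            rows.append((s["id"], month, s["exposure"][0], s["pain"][month]))
    return rows


def layout_model_2(subjects, lag=12):
    """Pair each outcome with the exposure measured `lag` months earlier."""
    rows = []
    for s in subjects:
        for month in (12, 24):
            rows.append(
                (s["id"], month, s["exposure"][month - lag], s["pain"][month])
            )
    return rows


print(layout_model_1(subjects))
print(layout_model_2(subjects))
```

In both layouts the subject id would serve as the clustering variable, so that the within subject correlation between the two rows of each subject can be taken into account. The two layouts differ only for subjects whose exposure changed between baseline and 12 months (subject 1 in this toy example).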

      The two models in which risk factors were assumed to be time independent (standard logistic regression and GEE model 1) produced similar point estimates for developing new onset low back pain (tables 1 and 2), with narrower 95% confidence intervals for GEE model 1. In GEE model 2, where risk factors are assumed to be time dependent, differences were noted for only a small number of variables (carrying on one shoulder, lifting at or above shoulder level, and general health questionnaire score). In addition, the 95% confidence intervals were narrower than those derived from the standard logistic regression and GEE model 1. However, there was no consistent pattern of attenuation or growth noted in either the mechanical or psychosocial risk factors examined (tables 1 and 2).

      In summary, we agree that it is important to investigate different statistical techniques in an attempt to determine what effect the assumptions of time dependence (or independence) have on predictors of musculoskeletal pain. However, unlike the study by Hoogendoorn et al,1 our data show that the choice of model has relatively little consistent influence on the magnitude of the results, although the GEE models give more precise estimates, as reflected in their narrower 95% confidence intervals.

      Table 1

      Work related mechanical risk factors and new onset low back pain*

      Table 2

      Work related psychosocial risk factors and new onset low back pain*


      Authors’ reply

      In response to our paper on a comparison of different approaches to the analysis of data from a prospective cohort study,1 Harkness et al performed a similar analysis on data from their two year prospective cohort study on work related exposures and new onset of low back pain. They agree that it is important to determine the effect of assumptions on time dependence on the observed relations between work related factors and low back pain. They also conclude that their data show that the choice of the model has relatively little influence on the magnitude of the associations.

      First, we are pleased with their response to our article, because this is a contribution to the discussion on the design and analysis of repeated measurements studies that we hoped to initiate. Nevertheless, we think that some elements are overlooked in the interpretation of their results.

      The most important point is that taking into account repeated measurements in their analysis can only have an influence on the magnitude of the associations observed when there is indeed variability over time in the exposure(s) and/or the outcome measure at issue. Harkness et al do not report on the variability over time in the work related mechanical and psychosocial factors, nor on the variability of low back pain in their data. This information would have been helpful in the interpretation of the results presented in tables 1 and 2. Harkness et al remark that there was no consistent pattern of attenuation or growth noted in the observed associations for either the mechanical or psychosocial risk factors examined. In our opinion, the differences between the associations observed in GEE models 1 and 2 will not necessarily show a consistent pattern as changes over time are not necessarily the same for different exposures.
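
One simple way to summarise such variability would be the proportion of subjects whose exposure category changed between two assessments. The following Python sketch illustrates this under the assumption of a binary exposure. The function name and the values are made up for illustration and do not come from either study.

```python
def proportion_changed(first, second):
    """Share of subjects whose binary exposure differs between two assessments.

    `first` and `second` are equal-length lists aligned by subject.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    changed = sum(1 for a, b in zip(first, second) if a != b)
    return changed / len(first)


baseline = [1, 0, 1, 1, 0]  # made-up exposure at baseline
month_12 = [1, 0, 0, 1, 1]  # made-up exposure at 12 months
print(proportion_changed(baseline, month_12))  # 2 of 5 subjects changed
```

A proportion close to zero would mean that GEE models 1 and 2 are fitted to almost identical exposure data, in which case similar estimates would be expected.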

      Harkness et al use exactly the same models as we did. Of course, this increases the comparability. However, it is also important to choose an appropriate temporal position for the exposure window relative to the outcome event in the model that includes time dependent measures of the exposure and the outcome (GEE model 2). Harkness et al do not explain why they use the same time lag as we did. Since their outcome measure concerns pain in the past month, and not pain in the past 12 months as in our study, using no time lag might have been another option to consider.

      Finally, we do not understand why the authors adjusted for occupation in the analyses. In our opinion, this may introduce overadjustment because subjects from different occupational settings were included to obtain a contrast in the work related exposures studied.

      We appreciate the contribution of Harkness et al and recognise that there are still many unanswered important questions regarding the data analysis of cohort studies with multiple measurements of work related factors and musculoskeletal symptoms.