Methods Inf Med 2010; 49(05): 421-425
DOI: 10.3414/ME10-01-0026
Original Articles
Schattauer GmbH

Generalized Estimating Equations

Notes on the Choice of the Working Correlation Matrix
A. Ziegler (1), M. Vens (1)
1   Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Lübeck, Germany

Publication History

received: 29 March 2010

accepted: 12 May 2010

Publication Date: 17 January 2018 (online)

Summary

Background: Generalized estimating equations (GEE) are an extension of generalized linear models (GLM) in that they allow adjusting for correlations between observations. A major strength of GEE is that they do not require the correct specification of the multivariate distribution but only of the mean structure.

Objectives: Several concerns have been raised about the validity of GEE when applied to dichotomous dependent variables. In this contribution, we summarize the theoretical findings concerning efficiency and validity of GEE.

Methods: We introduce the GEE in a formal way, summarize general findings on the choice of the working correlation matrix, and show the existence of a dilemma for the optimal choice of the working correlation matrix for dichotomous dependent variables.

Results: Biological and statistical arguments for choosing a specific working correlation matrix are given. Three approaches are described for overcoming the range restriction of the correlation coefficient.

Conclusions: The three approaches described in this article for overcoming the range restriction for dichotomous dependent variables in GEE models are simple and practical to use in applications.
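
The range restriction mentioned in the summary can be illustrated with a small sketch. It is not taken from the article. It relies only on the standard fact that, for two dichotomous variables with fixed marginal probabilities, the joint probability of both being 1 is bounded, and this bounds the attainable correlation. The function name `binary_correlation_bounds` and its arguments `p1` and `p2` are hypothetical names chosen for this illustration.

```python
from math import sqrt


def binary_correlation_bounds(p1: float, p2: float) -> tuple[float, float]:
    """Return (lower, upper) bounds on the correlation of two 0/1 variables.

    p1 and p2 are the marginal probabilities P(Y1 = 1) and P(Y2 = 1).
    Illustrative helper only; names are assumptions, not from the article.
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("marginal probabilities must lie strictly between 0 and 1")

    # The joint probability P(Y1 = 1, Y2 = 1) is restricted by the marginals.
    p11_min = max(0.0, p1 + p2 - 1.0)
    p11_max = min(p1, p2)

    # Correlation = (P11 - p1 * p2) / sqrt(Var(Y1) * Var(Y2)).
    scale = sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))
    lower = (p11_min - p1 * p2) / scale
    upper = (p11_max - p1 * p2) / scale
    return lower, upper
```

For example, with marginal probabilities 0.1 and 0.5 the correlation cannot exceed roughly 0.33 in absolute value, so a working correlation of 0.6 would not correspond to any valid joint distribution of the two responses. Only when both marginals equal 0.5 is the full range from -1 to 1 attainable.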

 