Exploiting gene-environment independence for analysis of case-control studies: an empirical Bayes-type shrinkage estimator to trade-off between bias and efficiency

Biometrics. 2008 Sep;64(3):685-694. doi: 10.1111/j.1541-0420.2007.00953.x. Epub 2007 Dec 20.

Abstract

Standard prospective logistic regression analysis of case-control data often leads to very imprecise estimates of gene-environment interactions due to small numbers of cases or controls in cells of crossing genotype and exposure. In contrast, under the assumption of gene-environment independence, modern "retrospective" methods, including the "case-only" approach, can estimate the interaction parameters much more precisely, but they can be seriously biased when the underlying assumption of gene-environment independence is violated. In this article, we propose a novel empirical Bayes-type shrinkage estimator to analyze case-control data that can relax the gene-environment independence assumption in a data-adaptive fashion. In the special case of a binary gene and a binary exposure, the method leads to an estimator of the interaction log odds ratio parameter in a simple closed form that corresponds to a weighted average of the standard case-only and case-control estimators. We also describe a general approach for deriving the new shrinkage estimator and its variance within the retrospective maximum-likelihood framework developed by Chatterjee and Carroll (2005, Biometrika 92, 399-418). Both simulated and real data examples suggest that the proposed estimator strikes a balance between bias and efficiency depending on the true nature of the gene-environment association and the sample size for a given study.
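
To make the binary-gene, binary-exposure case concrete, the following is a minimal Python sketch of a weighted average of the case-only and case-control interaction estimates. The abstract does not state the weights, so the form used here is an assumption for illustration: the weight on the case-control estimate is theta^2 / (theta^2 + var_cc), where theta is the gene-environment log odds ratio among controls and var_cc is the Woolf-type variance of the case-control estimate. The function names and the cell ordering (n00, n01, n10, n11) are hypothetical, and zero cells are not handled.

```python
import math


def log_odds_ratio(n00, n01, n10, n11):
    """Log odds ratio of a 2x2 table of gene (rows) by exposure (columns)."""
    return math.log((n00 * n11) / (n01 * n10))


def shrinkage_interaction(cases, controls):
    """Illustrative shrinkage estimate of the interaction log odds ratio.

    cases, controls: tuples (n00, n01, n10, n11) of cell counts indexed by
    (gene, exposure). All counts are assumed to be positive.
    """
    # Case-only estimate: G-E log odds ratio among cases.
    beta_co = log_odds_ratio(*cases)
    # G-E log odds ratio among controls; zero under G-E independence.
    theta = log_odds_ratio(*controls)
    # Case-control estimate: difference of the two log odds ratios.
    beta_cc = beta_co - theta
    # Woolf-type variance of the case-control estimate (sum of 1/count).
    var_cc = sum(1.0 / n for n in cases) + sum(1.0 / n for n in controls)
    # ASSUMED weight: leans on the case-control estimate when the controls
    # show evidence of G-E association relative to sampling noise.
    weight_cc = theta ** 2 / (theta ** 2 + var_cc)
    beta_eb = weight_cc * beta_cc + (1.0 - weight_cc) * beta_co
    return {
        "case_only": beta_co,
        "case_control": beta_cc,
        "weight_case_control": weight_cc,
        "shrinkage": beta_eb,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    result = shrinkage_interaction(
        cases=(80, 40, 30, 45), controls=(100, 50, 40, 40)
    )
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

Under this assumed weighting, the estimate collapses to the case-only value when the controls show no gene-environment association and moves toward the case-control value as that association grows relative to its sampling variability, which is the data-adaptive behavior the abstract describes.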

Publication types

  • Research Support, N.I.H., Intramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adenoma / etiology
  • Adenoma / genetics
  • Bayes Theorem
  • Bias
  • Biometry
  • Case-Control Studies
  • Colorectal Neoplasms / etiology
  • Colorectal Neoplasms / genetics
  • Databases, Factual
  • Environmental Exposure*
  • Epidemiologic Methods*
  • Female
  • Genotype*
  • Humans
  • Likelihood Functions
  • Logistic Models
  • Odds Ratio
  • Ovarian Neoplasms / etiology
  • Ovarian Neoplasms / genetics
  • Retrospective Studies
  • Stochastic Processes