Influence Function Based Variance Estimation and Missing Data Issues in Case-Cohort Studies

Lifetime Data Analysis

Abstract

Recognizing that the efficiency in relative risk estimation for the Cox proportional hazards model is largely constrained by the total number of cases, Prentice (1986) proposed the case-cohort design in which covariates are measured on all cases and on a random sample of the cohort. Subsequent to Prentice, other methods of estimation and sampling have been proposed for these designs. We formalize an approach to variance estimation suggested by Barlow (1994), and derive a robust variance estimator based on the influence function. We consider the applicability of the variance estimator to all the proposed case-cohort estimators, and derive the influence function when known sampling probabilities in the estimators are replaced by observed sampling fractions. We discuss the modifications required when cases are missing covariate information. The missingness may occur by chance, and be completely at random; or may occur as part of the sampling design, and depend upon other observed covariates. We provide an adaptation of S-plus code that allows estimating influence function variances in the presence of such missing covariates. Using examples from our current case-cohort studies on esophageal and gastric cancer, we illustrate how our results are useful in solving design and analytic issues that arise in practice.
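The general idea of an influence-function variance can be illustrated on a much simpler estimator than the Cox case-cohort estimators treated in the article. The following is a minimal sketch, not the authors' S-plus code: it assumes a weighted mean whose weights are known inverse sampling probabilities, and the function name `weighted_mean_with_if_variance` is hypothetical. Each subject's influence value approximates how much that subject shifts the estimate, and the robust variance estimate is the sum of the squared influence values.

```python
def weighted_mean_with_if_variance(y, w):
    """Weighted mean with an influence-function (sandwich-type) variance.

    y: outcomes for the sampled subjects.
    w: sampling weights, ASSUMED here to be known inverse sampling
       probabilities (the article also treats estimated fractions,
       which this toy sketch does not cover).
    """
    if len(y) != len(w) or len(y) == 0:
        raise ValueError("y and w must be non-empty and of equal length")

    total_w = sum(w)
    if total_w <= 0:
        raise ValueError("weights must have a positive sum")

    # Point estimate: solves sum_i w_i * (y_i - theta) = 0.
    theta = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Influence value of subject i: its weighted score contribution
    # scaled by the inverse of the summed weights.
    influence = [wi * (yi - theta) / total_w for wi, yi in zip(w, y)]

    # Robust variance estimate: sum of squared influence values.
    variance = sum(d * d for d in influence)
    return theta, variance, influence


if __name__ == "__main__":
    theta, variance, influence = weighted_mean_with_if_variance(
        [1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]
    )
    print(theta, variance)
```

With equal weights the estimate reduces to the ordinary mean, and the influence values sum to zero by construction. This sketch treats the weights as fixed; it does not reflect the adjustment to the influence function that the abstract describes for observed sampling fractions.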

References

  • C. C. Abnet, C. B. Borkowf, Y. L. Qiao, P. S. Albert, E. Wang, A. H. Merrill, S. D. Mark, Z. W. Dong, P. R. Taylor, S. M. Dawsey, “Sphingolipids as biomarkers of fumonisin exposure and risk of esophageal squamous cell carcinoma,” to appear in Cancer Epidemiology Biomarkers and Prevention, 2001.

  • P. K. Andersen, Ø. Borgan, R. D. Gill, and N. Keiding, Statistical Models Based on Counting Processes, Springer-Verlag: New York, NY, 1991.

  • W. E. Barlow, “Robust variance estimation for the case-cohort design,” Biometrics, vol 50 pp. 1064–1072, 1994.

  • O. Borgan, B. Langholz, S. O. Samuelsen, L. Goldstein and J. Pogoda, “Exposure stratified case-cohort designs,” Lifetime Data Analysis, vol 6 pp. 39–58, 2000.

  • O. Borgan, L. Goldstein and B. Langholz, “Methods for the analysis of sampled cohort data in the Cox proportional hazards model,” The Annals of Statistics, vol 23 pp. 1749–1778, 1995.

  • K. C. Cain and N. T. Lange, “Approximate case influence for the proportional hazards regression model in censored data,” Biometrics, vol 40 pp. 493–499, 1984.

  • Helicobacter and Cancer Collaborative Group, “Gastric cancer and Helicobacter pylori: a combined analysis of eleven case-control studies nested within prospective cohorts,” Gut, vol 3 pp. 347–353, 2001.

  • P. J. Huber, Robust Statistical Procedures, Society for Industrial and Applied Mathematics: Philadelphia, PA, 1977.

  • J. D. Kalbfleisch and J. F. Lawless, “Likelihood analysis of multi-state models for disease incidence and mortality,” Statistics in Medicine, vol 7 pp. 149–160, 1988.

  • S. Kim and V. De Gruttola, “Strategies for cohort sampling under the Cox proportional hazards model, application to an AIDS clinical trial,” Lifetime Data Analysis, vol 5 pp. 149–172, 1999.

  • P. J. Limburg, C. Q. Wang, S. D. Mark, Y. L. Qiao, G. I. Perez-Perez, M. J. Blaser, P. R. Taylor, Z. W. Dong, S. M. Dawsey, “Helicobacter pylori seropositivity: Association with increased gastric cardia and noncardia cancer risks in Linxian, China,” Journal of the National Cancer Institute, vol 93 pp. 226–233, 2001.

  • D. Y. Lin and L. J. Wei, “The robust inference for the Cox proportional hazards model,” Journal of the American Statistical Association, vol 84 pp. 1074–1078, 1989.

  • D. Y. Lin and Z. Ying, “Cox regression with incomplete covariate measurements,” Journal of the American Statistical Association, vol 88 pp. 1341–1349, 1993.

  • S. D. Mark, Y. L., S. M. Dawsey, H. Katki, E. W. Gunter, W. Yan-Ping, J. F. Fraumeni, W. J. Blot, Z. W. Dong, P. R. Taylor, “Higher serum selenium is associated with lower esophageal and gastric cardia cancer rates,” Journal of the National Cancer Institute, vol 92 pp. 1753–1763, 2000.

  • H. Moller, E. Heseltine, H. Vainio, “Working group report on schistosomes, liver flukes and Helicobacter pylori,” International Journal of Cancer, vol 60 pp. 587–589, 1994.

  • R. L. Prentice, “A case-cohort design for epidemiologic cohort studies and disease prevention trials,” Biometrika, vol 73 pp. 1–11, 1986.

  • M. G. Pugh, Inference in the Cox Proportional Hazards Model with Missing Covariate Data, thesis, Harvard School of Public Health: Boston, MA, 1993.

  • N. Reid and H. Crepeau, “Influence functions for proportional hazards regression,” Biometrika, vol 72 pp. 1–9, 1985.

  • J. M. Robins, A. Rotnitzky, and L. P. Zhao, “Estimation of regression coefficients when some regressors are not always observed,” Journal of the American Statistical Association, vol 89 pp. 846–866, 1994.

  • S. G. Self and R. L. Prentice, “Asymptotic distribution theory and efficiency results for case-cohort studies,” The Annals of Statistics, vol 16 pp. 64–81, 1988.

  • T. M. Therneau and H. Li, “Computing the Cox Model for Case Cohort Designs,” Lifetime Data Analysis, vol 5 pp. 99–112, 1999.

Cite this article

Mark, S.D., Katki, H. Influence Function Based Variance Estimation and Missing Data Issues in Case-Cohort Studies. Lifetime Data Anal 7, 331–344 (2001). https://doi.org/10.1023/A:1012533130596
