Some issues in estimating the effect of prognostic factors from incomplete covariate data

W Vach

doi:10.1002/(sici)1097-0258(19970115)16:1<57::aid-sim471>3.0.co;2-s

Stat Med. 1997;16(1-3):57-72. doi: 10.1002/(sici)1097-0258(19970115)16:1<57::aid-sim471>3.0.co;2-s.

Author

W Vach¹

Affiliation

¹ Center for Data Analysis, University of Freiburg, Germany.

PMID: 9004383

Abstract

In evaluating prognostic factors by means of regression models, missing values in the covariate data are a frequent complication. There exist statistical tools to analyse such incomplete data in an efficient manner, and in this paper we make use of the traditional maximum likelihood principle. As well as an analysis including the incompletely measured covariates, such tools also allow further strategies of data analysis. For example, we can use surrogate variables to improve the prediction of missing values, or we can try to investigate a questionable "missing at random" assumption. We discuss these techniques using the example of a clinical study where one important covariate is missing for about half the subjects. Additionally, we consider two further issues: evaluation of differences between estimates from a complete case analysis and analyses using all subjects, and assessment of the predictive value of missing values.

MeSH terms

Data Interpretation, Statistical
Humans
Likelihood Functions
Models, Statistical*
Neoplasm, Residual / diagnosis
Neoplasm, Residual / surgery
Predictive Value of Tests
Prognosis
Randomized Controlled Trials as Topic / methods*
Regression Analysis
Sensitivity and Specificity
Stochastic Processes
Tomography, X-Ray Computed