Study methods & design
Validity and Coverage of Estimates of Relative Accuracy
Introduction
Sensitivity and specificity of a test can be estimated in a study only if the disease status of all subjects is independently confirmed using a gold standard. A gold standard, however, can rarely be administered to all subjects because of ethical or cost considerations. Studies comparing two tests often restrict confirmation of disease status to subjects classified as positive by either test. In this context, absolute measures of accuracy (sensitivity and specificity) cannot be readily estimated because the confirmatory procedure does not lead to an assessment of the disease status of all subjects (1–3). Relative measures of test accuracy, such as relative sensitivity (the ratio of two sensitivities, RSN) and relative false-positive rate (the ratio of two false-positive rates, RFP), however, can be estimated in these circumstances (3, 4). We have described a simple asymptotic variance estimator that can be used to calculate the CI of RSN and RFP (4).
In this article, we investigate the effect of disease prevalence, sample size, test accuracy, and interdependence between tests on the validity of RSN and RFP estimates and on the coverage of their CI.
Background
We focus on a self-matched study design (i.e., the two tests to be compared are administered simultaneously to the same individuals) with dichotomous disease status and test results (Table 1).
In Table 1, ND and ND̄ are the unknown numbers of diseased and nondiseased subjects. The true disease status of test-positive subjects is ascertained using a confirmatory procedure (a gold standard) (Tables 1B and 1C).
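A short derivation may help show why the ratios remain estimable even though ND and ND̄ are unknown. Table 1 is not reproduced here, so the cell labels below are assumptions made only for illustration: among confirmed diseased subjects, a are positive on both tests, b on test 1 only, and c on test 2 only; e, f, and g are the corresponding counts among confirmed nondiseased subjects.

```latex
\[
\mathrm{RSN} \;=\; \frac{SN_1}{SN_2}
            \;=\; \frac{(a+b)/N_D}{(a+c)/N_D}
            \;=\; \frac{a+b}{a+c},
\qquad
\mathrm{RFP} \;=\; \frac{FP_1}{FP_2}
            \;=\; \frac{(e+f)/N_{\bar D}}{(e+g)/N_{\bar D}}
            \;=\; \frac{e+f}{e+g}.
\]
```

The unknown denominators cancel, so each ratio depends only on counts observed among test-positive subjects whose disease status has been confirmed.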
For data in Tables 1B and 1C, estimates of RSN, RFP, and 95% CI of lnRSN and lnRFP are
Results
The simulation work yielded 6,400,000 empirical sets of tables with the same layout as Tables 1B and 1C. One or both marginal totals were 0 in 122 sample tables with a layout similar to Table 1B. Zero marginals were also observed in 101,896 sample tables with a layout similar to Table 1C. Thus, lnRSN or lnRFP and their variance estimates could not be computed for such tables, and the samples were deleted from the analysis of bias and coverage.
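The exclusion rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name and argument names are hypothetical, and the variance on the log scale is the standard delta-method result for the ratio of two paired proportions, assumed here to correspond to the asymptotic estimator cited in the Introduction (the source's own expressions are not reproduced in this text).

```python
import math
from typing import Optional, Tuple


def relative_accuracy(
    both: int, only1: int, only2: int, z: float = 1.96
) -> Optional[Tuple[float, float, float]]:
    """Ratio of test-1 to test-2 positive rates with a log-scale CI.

    Arguments are counts from one confirmed stratum (diseased subjects
    for RSN, nondiseased subjects for RFP); the names are hypothetical:
      both  - positive on both tests
      only1 - positive on test 1 only
      only2 - positive on test 2 only
    Returns (ratio, lower, upper), or None when a marginal total is 0
    and the log ratio is therefore undefined.
    """
    m1 = both + only1  # marginal total of test-1 positives
    m2 = both + only2  # marginal total of test-2 positives
    if m1 == 0 or m2 == 0:
        return None  # such tables are dropped from the analysis

    ratio = m1 / m2
    # Assumed delta-method variance of ln(ratio) for paired proportions.
    var_log = (only1 + only2) / (m1 * m2)
    half_width = z * math.sqrt(var_log)
    log_ratio = math.log(ratio)
    return ratio, math.exp(log_ratio - half_width), math.exp(log_ratio + half_width)


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(relative_accuracy(30, 10, 5))
    print(relative_accuracy(0, 0, 5))  # zero marginal: returns None
```

Returning None rather than raising keeps a simulation loop simple: tables with a zero marginal can be counted and skipped, mirroring the deletion of such samples from the analysis of bias and coverage.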
All the 122 data sets of the form of Table 1B with
Example
We applied the methods described in this paper to breast cancer screening data from the New York Health Insurance Plan study (6). Based on that study, Schatzkin and coauthors described the estimation of RSN and RFP (3). In the New York Health Insurance Plan study, 20,211 women aged 40–64 years were screened for breast cancer. Breast biopsies were obtained from 307 women in whom either mammography or physical examination was positive for a suspicious breast lesion. In 55 cases, the lesion was
Discussion
RSN and RFP are two important measures of test accuracy when only confirmation of test-positive results is feasible. Because of ethical and practical considerations, this study design has been used in a variety of settings, such as screening for breast (6), colorectal (9, 10), and prostate (11) cancers, and for the diagnosis or screening of sexually transmitted diseases such as Chlamydia trachomatis infection (12, 13). Thus, the methods described in this paper are applicable to a broad range of
References (14)
- et al. Update of the Swedish two-county program of mammographic screening for breast cancer. Radiol Clin North Am. (1992)
- et al. Basic issues in population screening for cancer. J Natl Cancer Inst. (1980)
- Screening in Chronic Disease. (1992)
- et al. Comparing new and old screening tests when a confirmatory procedure cannot be performed on all screenees. Example of automated cytometry for early detection of cervical cancer. Am J Epidemiol. (1987)
- et al. Comparison of the accuracy of two tests with a confirmatory procedure limited to positive results. Epidemiology. (1997)
- SAS® Language Reference, Version 6. (1990)
- et al. Mammography and clinical examination in mass screening for cancer of breast. Cancer. (1967)